Tuesday, February 2, 2021

Why does as.matrix convert numeric values to strings in data.frames? [duplicate]

I have two text files. One file is .txt and the other is .csv. The .csv file has one additional column (with NA values). All the others are the same. When I read those files with the commands:

    subject.info = read.table(paste(data_dir, "outd01_all_subject_info.txt", sep = slash), header = TRUE)
    subject.info = read.csv("data_d01_features/outd01_all_subject_info2.txt", sep = ',', header = TRUE, stringsAsFactors = FALSE)

The dataframe subject.info looks the same, but when I run:

    as.matrix(subject.info)

All the data in the second file are converted to strings:

          SUBJID         Sex age  trauma_age ptsd
     [1,] "600039015048" "2" "11" NA         "0"
     [2,] "600110937794" "1" "10" NA         "0"
     [3,] "600129552715" "1" "11" " 8"       "2"
     [4,] "600210241146" "1" "18" "16"       "2"
     [5,] "600294620965" "1" "13" NA         "0"
     [6,] "600409285352" "2" "16" "15"       "1"
     [7,] "600460215379" "1" "10" NA         "0"
     [8,] "600547831711" "1" "10" " 6"       "1"
     [9,] "600561317124" "2" "19" "19"       "1"
    [10,] "600635899969" "2" "11" NA         "0"
    [11,] "600647003585" "1" "18" NA         "0"
    [12,] "600682103788" "1" "18" "15"       "2"
    [13,] "600689706588" "1" "16" "15"       "2"
    [14,] "600747749665" "2" " 9" " 7"       "1"

This does not happen for the first file:

                SUBJID Sex age ptsd
     [1,] 600039015048   2  10    0
     [2,] 600110937794   1   9    0
     [3,] 600129552715   1  10    2
     [4,] 600210241146   1  17    2
     [5,] 600294620965   1  13    0
     [6,] 600409285352   2  15    1
     [7,] 600460215379   1   8    0
     [8,] 600547831711   1   8    1
     [9,] 600561317124   2  19    1
    [10,] 600635899969   2  11    0
    [11,] 600647003585   1  19    0
    [12,] 600682103788   1  18    2
    [13,] 600689706588   1  15    2
    [14,] 600747749665   2   8    1

Is this due to the NA values? But when I replace NAs with 0 in the second file, the problem still exists:

          SUBJID         Sex age  trauma_age ptsd
     [1,] "600039015048" "2" "11" " 0"       "0"
     [2,] "600110937794" "1" "10" " 0"       "0"
     [3,] "600129552715" "1" "11" " 8"       "2"
     [4,] "600210241146" "1" "18" "16"       "2"
     [5,] "600294620965" "1" "13" " 0"       "0"
     [6,] "600409285352" "2" "16" "15"       "1"
     [7,] "600460215379" "1" "10" " 0"       "0"
     [8,] "600547831711" "1" "10" " 6"       "1"
     [9,] "600561317124" "2" "19" "19"       "1"
    [10,] "600635899969" "2" "11" " 0"       "0"
    [11,] "600647003585" "1" "18" " 0"       "0"
    [12,] "600682103788" "1" "18" "15"       "2"
    [13,] "600689706588" "1" "16" "15"       "2"
    [14,] "600747749665" "2" " 9" " 7"       "1"

The problem also persists if I convert the second file to a .csv file, or if I read it with read.table or read.csv2.
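One way to narrow this down is to check the class of every column. This is only a sketch on a hypothetical toy data frame, not the real subject.info: the column values and the choice of ptsd as the character column are assumptions made for illustration. It relies on the documented behavior that as.matrix on a data frame returns a character matrix as soon as any column is non-numeric (character or factor), while NA values in otherwise numeric columns do not trigger the conversion.

```r
# Hypothetical toy data, NOT the real subject.info file.
# ptsd is deliberately stored as character to reproduce the symptom.
df <- data.frame(
  SUBJID     = c(600039015048, 600129552715),
  age        = c(11L, 9L),
  trauma_age = c(NA, 8L),
  ptsd       = c("0", "2"),
  stringsAsFactors = FALSE
)

# Class of each column; any "character" or "factor" entry here
# forces as.matrix() to build a character matrix.
col_classes <- sapply(df, class)
print(col_classes)

# With one character column present, every value becomes a string.
m_chr <- as.matrix(df)

# Keeping only the numeric columns gives a numeric matrix,
# even though trauma_age contains NA.
m_num_only <- as.matrix(df[sapply(df, is.numeric)])

# Converting the offending column restores a fully numeric matrix.
df_fixed <- df
df_fixed$ptsd <- as.numeric(df_fixed$ptsd)
m_fixed <- as.matrix(df_fixed)
```

Running sapply(subject.info, class) on the data frame read from the second file should show whether one of its columns was read as character or factor. Which column that is, if any, cannot be determined from the output shown above.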

https://stackoverflow.com/questions/66002580/why-as-matrix-convert-numeric-values-to-string-in-data-frames February 02, 2021 at 09:44AM
