I have two text files: one is .txt and the other is .csv. The .csv file has one additional column (containing NA values); all the other columns are the same. I read the files with these commands:
subject.info = read.table(paste(data_dir, "outd01_all_subject_info.txt", sep = slash), header=TRUE)
subject.info = read.csv("data_d01_features/outd01_all_subject_info2.txt", sep = ',', header=TRUE, stringsAsFactors = FALSE)
The dataframe subject.info looks the same, but when I run:
as.matrix(subject.info)
All the data in the second file are converted to strings:
      SUBJID         Sex age  trauma_age ptsd
 [1,] "600039015048" "2" "11" NA         "0"
 [2,] "600110937794" "1" "10" NA         "0"
 [3,] "600129552715" "1" "11" " 8"       "2"
 [4,] "600210241146" "1" "18" "16"       "2"
 [5,] "600294620965" "1" "13" NA         "0"
 [6,] "600409285352" "2" "16" "15"       "1"
 [7,] "600460215379" "1" "10" NA         "0"
 [8,] "600547831711" "1" "10" " 6"       "1"
 [9,] "600561317124" "2" "19" "19"       "1"
[10,] "600635899969" "2" "11" NA         "0"
[11,] "600647003585" "1" "18" NA         "0"
[12,] "600682103788" "1" "18" "15"       "2"
[13,] "600689706588" "1" "16" "15"       "2"
[14,] "600747749665" "2" " 9" " 7"       "1"
This does not happen for the first file:
            SUBJID Sex age ptsd
 [1,] 600039015048   2  10    0
 [2,] 600110937794   1   9    0
 [3,] 600129552715   1  10    2
 [4,] 600210241146   1  17    2
 [5,] 600294620965   1  13    0
 [6,] 600409285352   2  15    1
 [7,] 600460215379   1   8    0
 [8,] 600547831711   1   8    1
 [9,] 600561317124   2  19    1
[10,] 600635899969   2  11    0
[11,] 600647003585   1  19    0
[12,] 600682103788   1  18    2
[13,] 600689706588   1  15    2
[14,] 600747749665   2   8    1
Is this due to the NA values? But when I replace NAs with 0 in the second file, the problem still exists:
      SUBJID         Sex age  trauma_age ptsd
 [1,] "600039015048" "2" "11" " 0"       "0"
 [2,] "600110937794" "1" "10" " 0"       "0"
 [3,] "600129552715" "1" "11" " 8"       "2"
 [4,] "600210241146" "1" "18" "16"       "2"
 [5,] "600294620965" "1" "13" " 0"       "0"
 [6,] "600409285352" "2" "16" "15"       "1"
 [7,] "600460215379" "1" "10" " 0"       "0"
 [8,] "600547831711" "1" "10" " 6"       "1"
 [9,] "600561317124" "2" "19" "19"       "1"
[10,] "600635899969" "2" "11" " 0"       "0"
[11,] "600647003585" "1" "18" " 0"       "0"
[12,] "600682103788" "1" "18" "15"       "2"
[13,] "600689706588" "1" "16" "15"       "2"
[14,] "600747749665" "2" " 9" " 7"       "1"
The problem also persists if I convert the second file to a .csv file, and if I use read.table or read.csv2 instead.
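One way to narrow this down is to check the class of every column, since as.matrix() on a data frame returns a character matrix as soon as any single column is non-numeric, with or without NA values. The sketch below uses a made-up toy data frame (the names df, id, col_classes, m_all and m_num are hypothetical, not taken from the files above) and assumes that at least one column in the second file is being read as character. Padded values such as " 8" in the output above would be consistent with numeric columns being formatted as text in that situation.

```r
# Toy data frame (made-up values, not the real file): a single character
# column is enough to make as.matrix() return a character matrix.
df <- data.frame(
  id         = c("a", "b", "c"),  # hypothetical character column
  age        = c(11, 9, 18),
  trauma_age = c(NA, 7, 16),      # numeric column containing an NA
  stringsAsFactors = FALSE
)

# Step 1: inspect the class of each column to spot the non-numeric one(s).
col_classes <- sapply(df, class)
print(col_classes)

# Step 2: with mixed column types, the resulting matrix is character.
m_all <- as.matrix(df)
print(typeof(m_all))

# Step 3: keeping only the numeric columns gives a numeric matrix,
# and the NA stays a numeric NA.
is_num <- sapply(df, is.numeric)
m_num <- as.matrix(df[, is_num, drop = FALSE])
print(typeof(m_num))
```

Running sapply(subject.info, class) (or str(subject.info)) on the real data frame should show which column is not numeric; whether that column can safely be dropped or converted depends on what it actually contains.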
https://stackoverflow.com/questions/66002580/why-as-matrix-convert-numeric-values-to-string-in-data-frames February 02, 2021 at 09:44AM