Tuesday, February 9, 2021

non-uniform length multivariable time series clustering with R

My dataset is like this:

df

ID     Day  Score1  Score2  Score3  ...  Score12
1      1    3       4       3       ...  2
1      5    3       3       3       ...  1
1      10   2       3       2       ...  1
1      30   2       1       1       ...  0
2      6    4       3       2       ...  4
2      50   4       3       2       ...  2
3      8    2       4       2       ...  4
3      40   2       3       2       ...  4
3      70   2       3       0       ...  2
...
99979  2    4       3       3       ...  2
99979  5    4       2       2       ...  2
99979  67   4       2       0       ...  1
99979  79   3       2       0       ...  0
99990  3    1       4       3       ...  1
99990  78   0       3       3       ...  0

As in the example above, there are thousands of IDs, and each ID is scored on a different number of days. For example, ID 1 is scored 4 times (days 1, 5, 10, 30) and ID 3 is scored 3 times (days 8, 40, 70). The number of measurements per ID is therefore non-uniform, and I want to cluster the IDs in two ways.
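For reference, the rows shown for IDs 1 to 3 can be rebuilt in R and split into one matrix per ID, which makes the unequal lengths explicit. This is only a sketch: Score4 to Score11 are elided in the question, so just the four visible score columns are included, and the names `score_cols` and `series` are illustrative.

```r
# Rows for IDs 1-3 copied from the excerpt; Score4..Score11 are elided in the
# question, so only the four visible score columns appear here.
df <- data.frame(
  ID      = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
  Day     = c(1, 5, 10, 30, 6, 50, 8, 40, 70),
  Score1  = c(3, 3, 2, 2, 4, 4, 2, 2, 2),
  Score2  = c(4, 3, 3, 1, 3, 3, 4, 3, 3),
  Score3  = c(3, 3, 2, 1, 2, 2, 2, 2, 0),
  Score12 = c(2, 1, 1, 0, 4, 2, 4, 4, 2)
)

# One matrix per ID: rows are measurement days, columns are scores.
score_cols <- grep("^Score", names(df), value = TRUE)
series <- lapply(split(df, df$ID), function(d) {
  d <- d[order(d$Day), ]
  m <- as.matrix(d[, score_cols])
  rownames(m) <- d$Day
  m
})

# Number of measurements per ID (the non-uniform lengths).
sapply(series, nrow)
```

A list of per-ID matrices like this is a common input shape for time series clustering tools that accept series of different lengths.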

1. The first method uses 12 dimensions, Δscore1 to Δscore12, and I want to assign each ID to one of 6 clusters. For example, ID 1 is clustered on its 12 Δscore values, that is, the change over the measurement days of score 1 through score 12. (For ID 1: Δscore1 = 3->3->2->2, Δscore2 = 4->3->3->1, ..., Δscore12 = 2->1->1->0.)

2. The second method bundles Δscore1 to Δscore3 into Δscore_A, Δscore4 to Δscore9 into Δscore_B, and Δscore10 to Δscore12 into Δscore_C, and clusters each ID in the three dimensions Δscore_A, Δscore_B, and Δscore_C. For example, for ID 1, Δscore_A = [Δscore1, Δscore2, Δscore3], and likewise for Δscore_B and Δscore_C, so the 12 Δscores are grouped into 3 large items before clustering.

How can I do both kinds of clustering in R?
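One simple way to set up both clusterings, offered as a rough sketch rather than a definitive answer, is to reduce each ID's unequal-length trajectory to a fixed-length feature vector and then run an ordinary clustering algorithm on it. The code below makes several assumptions that are not in the question: the data are simulated, each Δscore is summarized as a least-squares slope against Day (change per day), each ID has at least two measurements, the bundles A, B, and C are formed by averaging their member scores, k-means is used as the clustering algorithm, and 6 clusters are used for the second method as well. The helper names `slope` and `slopes_by_id` are hypothetical.

```r
set.seed(1)

# Simulated stand-in for the real data: 60 IDs, 2-5 measurement days each,
# 12 integer scores in 0..4. All values are made up for illustration.
n_id <- 60
df <- do.call(rbind, lapply(seq_len(n_id), function(id) {
  n <- sample(2:5, 1)
  days <- sort(sample(1:90, n))
  scores <- matrix(sample(0:4, n * 12, replace = TRUE), nrow = n,
                   dimnames = list(NULL, paste0("Score", 1:12)))
  data.frame(ID = id, Day = days, scores)
}))
score_cols <- paste0("Score", 1:12)

# Least-squares slope of a score against Day (needs >= 2 distinct days).
slope <- function(day, y) cov(day, y) / var(day)

# One row of slopes per ID, one column per score in `cols`.
slopes_by_id <- function(d, cols) {
  t(sapply(split(d, d$ID), function(g) {
    sapply(cols, function(cl) slope(g$Day, g[[cl]]))
  }))
}

# Method 1: 12 dimensions (one slope per score), 6 clusters.
feat12 <- slopes_by_id(df, score_cols)
km12 <- kmeans(scale(feat12), centers = 6, nstart = 20)

# Method 2: bundle scores into A (1-3), B (4-9), C (10-12) by averaging,
# then cluster on the 3 resulting slopes. Averaging is an assumption.
groups <- list(A = 1:3, B = 4:9, C = 10:12)
df_grp <- data.frame(
  ID = df$ID, Day = df$Day,
  sapply(groups, function(idx) rowMeans(df[, score_cols[idx]]))
)
feat3 <- slopes_by_id(df_grp, names(groups))
km3 <- kmeans(scale(feat3), centers = 6, nstart = 20)

# Cluster sizes for each method.
table(km12$cluster)
table(km3$cluster)
```

A slope discards the shape of each trajectory, so this is only a starting point. An alternative that keeps the full series is a dynamic time warping distance, for example via the `dtwclust` package, which accepts a list of matrices with differing numbers of rows.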

https://stackoverflow.com/questions/66130776/non-uniform-length-multivariable-time-series-clustering-with-r February 10, 2021 at 12:06PM
