Friday, April 2, 2021

Different weights for datasets tf.data.experimental.sample_from_datasets

I run this piece of code:

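    import tensorflow as tf

    # Two datasets of different lengths: ten 1s and four 2s.
    a = tf.data.Dataset.from_tensor_slices([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
    b = tf.data.Dataset.from_tensor_slices([2, 2, 2, 2])
    d = [a, b]
    weights = [0.2, 0.8]

    # Draw elements from a and b according to the given weights.
    res = tf.data.experimental.sample_from_datasets(d, weights=weights)
    for elem in res:
        print(elem)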

And get the result:

    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(2, shape=(), dtype=int32)
    tf.Tensor(2, shape=(), dtype=int32)
    tf.Tensor(2, shape=(), dtype=int32)
    tf.Tensor(2, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)
    tf.Tensor(1, shape=(), dtype=int32)

That confused me. I expected both datasets a and b to be undersampled, so that I would get a result like [1,1,2,2,2] rather than one containing every original element, since the output size should be 0.2 * len(a) + 0.8 * len(b). Did I misunderstand the weighted sampling? Thanks in advance!
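To make the two readings concrete, here is a rough pure-Python sketch (no TensorFlow needed). The helper `sample_from_lists` is hypothetical and only an assumption about the semantics: each draw picks a source with probability given by its weight, and an exhausted source is skipped until every source is empty. That is how I understand the default behaviour of `sample_from_datasets`, but it is not the library's actual implementation. The last lines compute the undersample I was expecting instead.

```python
import random


def sample_from_lists(sources, weights, seed=None):
    """Hypothetical stand-in: weights choose WHICH source the next
    element comes from; they do not limit how many elements are taken."""
    rng = random.Random(seed)
    iterators = [iter(s) for s in sources]
    active = list(range(len(sources)))  # indices of sources not yet exhausted
    out = []
    while active:
        # Pick one of the remaining sources, proportionally to its weight.
        idx = rng.choices(active, weights=[weights[i] for i in active])[0]
        try:
            out.append(next(iterators[idx]))
        except StopIteration:
            # This source is empty: drop it and keep sampling the others.
            active.remove(idx)
    return out


a = [1] * 10
b = [2] * 4
weights = [0.2, 0.8]

# Reading 1: weights are per-draw probabilities, every element is emitted.
result = sample_from_lists([a, b], weights, seed=0)
print(len(result), result)

# Reading 2 (what the question expected): weights are fractions to keep.
expected = a[: int(weights[0] * len(a))] + b[: int(weights[1] * len(b))]
print(len(expected), expected)
```

Under the first reading the output always has all 14 elements, with the 2s tending to show up early because b is picked more often and runs out sooner. That would match the output above, where all ten 1s and four 2s appear.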

https://stackoverflow.com/questions/66926798/different-weights-for-datasets-tf-data-experimental-sample-from-datasets April 03, 2021 at 08:49AM
