I ran this piece of code:
a = tf.data.Dataset.from_tensor_slices([1,1,1,1,1,1,1,1,1,1])
b = tf.data.Dataset.from_tensor_slices([2,2,2,2])
d = [a, b]
weights = [0.2, 0.8]
res = tf.data.experimental.sample_from_datasets(d, weights=weights)
for elem in res:
    print(elem)
And get the result:
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(1, shape=(), dtype=int32)
That confused me. I thought it would undersample both datasets a and b, so I expected a result like [1,1,2,2,2] rather than one containing all of the original elements, since the output size should be 0.2 * len(a) + 0.8 * len(b). Did I misunderstand the weighted sampling? Thanks in advance!
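For context, my understanding so far is that the weights only pick which dataset supplies the *next* element, and nothing is discarded, so every element eventually appears. Here is a pure-Python sketch of that interleaving logic (the function name and structure are my own, not TensorFlow's implementation):

```python
import random

def weighted_interleave(datasets, weights, seed=0):
    """Sketch of weighted interleaving: at each step the weights choose
    which still-non-empty dataset supplies the next element. No element
    is ever dropped, so all 14 values appear in the output."""
    rng = random.Random(seed)
    remaining = [list(d) for d in datasets]
    out = []
    while any(remaining):
        # Restrict the draw to datasets that still have elements left.
        alive = [i for i, r in enumerate(remaining) if r]
        w = [weights[i] for i in alive]
        choice = rng.choices(alive, weights=w, k=1)[0]
        out.append(remaining[choice].pop(0))
    return out

merged = weighted_interleave([[1] * 10, [2] * 4], [0.2, 0.8])
```

Under this reading, the weights bias the interleaving order (2s tend to come early here), but the total length is always len(a) + len(b) = 14, matching the output I saw.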