I am following examples/hubert/README.md to start training, beginning with data preparation. The README says that the format of {train,valid}.tsv is:
<root-dir>
<audio-path-1>
<audio-path-2>
...
In my understanding, that means each line after the root directory contains only an audio path.
However, while processing the data and training, I found that each line of the .tsv file seems to carry a value in addition to the path, namely "sz" in the following code:
def load_audio(manifest_path, max_keep, min_keep):
    n_long, n_short = 0, 0
    names, inds, sizes = [], [], []
    with open(manifest_path) as f:
        root = f.readline().strip()
        for ind, line in enumerate(f):
            items = line.strip().split("\t")
            assert len(items) == 2, line
            sz = int(items[1])
            if min_keep is not None and sz < min_keep:
                n_short += 1
            elif max_keep is not None and sz > max_keep:
                n_long += 1
            else:
                names.append(items[0])
                inds.append(ind)
                sizes.append(sz)
    tot = ind + 1
    logger.info(
        (
            f"max_keep={max_keep}, min_keep={min_keep}, "
            f"loaded {len(names)}, skipped {n_short} short and {n_long} long, "
            f"longest-loaded={max(sizes)}, shortest-loaded={min(sizes)}"
        )
    )
    return root, names, inds, tot, sizes
So, I want to know what exactly the format of this data is. The first item is the path. What is the second item and how can it be obtained?
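To make the layout that load_audio parses concrete, here is a minimal sketch that writes a manifest with two tab-separated fields per line. It assumes the second field is the number of samples (frames) in the audio file; the README text quoted above does not state this, but the comparison against min_keep and max_keep suggests it. The helper names count_frames and write_manifest are hypothetical, and the sketch uses only the standard-library wave module, so it handles PCM .wav files only.

```python
import os
import wave


def count_frames(wav_path):
    """Return the number of sample frames in a PCM WAV file."""
    with wave.open(wav_path, "rb") as w:
        return w.getnframes()


def write_manifest(root_dir, manifest_path):
    """Hypothetical helper: write the root directory on the first line,
    then one tab-separated '<relative path> <frame count>' line per .wav file.

    Assumption: the second column is the sample count of each file.
    """
    root_dir = os.path.abspath(root_dir)
    rows = []
    for dirpath, _, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith(".wav"):
                full = os.path.join(dirpath, name)
                rows.append((os.path.relpath(full, root_dir), count_frames(full)))
    rows.sort()  # deterministic order across platforms
    with open(manifest_path, "w") as f:
        f.write(root_dir + "\n")
        for rel_path, n_frames in rows:
            f.write(f"{rel_path}\t{n_frames}\n")
    return rows
```

A file written this way satisfies the `len(items) == 2` assertion and the `int(items[1])` conversion in load_audio. Whether the frame count is the intended meaning of the second field is the open question here.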