Rework PIT example train.py and data.py by sibange · Pull Request #125 · fgnt/padertorch

sibange · 2021-11-08T14:56:29Z

No description provided.

thequilo · 2021-11-09T07:50:28Z

-    audio_keys = ['observation', 'speech_source']
+def prepare_dataset(db, dataset_name: str, batch_size, prefetch=True, shuffle=True):
+    """
+    Prepares the dataset for the training process (loading audio data, SFTF)


Typo: SFTF -> STFT

thequilo · 2021-11-09T07:51:01Z

+        shuffle: should the data be shuffeled
+
+    Returns:
+        desired dataset of the database in prepared for the training


Something is wrong with the grammar in this sentence

thequilo · 2021-11-09T07:52:36Z

+        _config: Configuration dict of the experiment
+        _run: Run object of the current run of the experiment
+
+    Returns:


This can be left out when there is no return value.

thequilo · 2021-11-09T07:54:53Z

+        None
+    """
+    init(_config, _run)
+    (trainer, train_dataset, validate_dataset) = prepare(_config)


The parentheses on the left-hand side are redundant.
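As a small illustration of this point: Python unpacks a returned tuple the same way with or without parentheses on the left-hand side. The `prepare` stub below is a hypothetical stand-in that returns placeholder strings. It is not the function from this PR.

```python
def prepare(config):
    # Hypothetical stand-in: returns three placeholder values as a tuple.
    return 'trainer', 'train_dataset', 'validate_dataset'


# Both assignments unpack identically; the second form is the idiomatic one.
(a, b, c) = prepare({})
trainer, train_dataset, validate_dataset = prepare({})

assert (a, b, c) == (trainer, train_dataset, validate_dataset)
```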

thequilo · 2021-11-09T07:57:00Z

+    # Test run to detects possible errors in the trainer/datasets
+    trainer.test_run(train_dataset, validate_dataset)
+
+    # path where the checkpoints of the training are stored


This comment is lower case, others are upper case. Stick to one (I prefer upper case)

TCord · 2021-11-09T08:00:38Z

    if shuffle:
        dataset = dataset.shuffle(reshuffle=True)
+
+    #Splitting the dataset in batches and sorting the frames in the batch


Better to write "... and sorts examples in a batch w.r.t. their duration" or something similar.
The frames themselves are not sorted.
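To make the distinction concrete, here is a minimal sketch in plain Python (no padertorch) of what "batch, then sort the examples in each batch by duration" means. The keys `example_id` and `num_samples`, the helper name `batch_and_sort`, and the longest-first order are assumptions for illustration. The frames inside each example stay untouched.

```python
def batch_and_sort(examples, batch_size, key='num_samples'):
    """Split examples into batches and sort each batch by duration.

    Only the order of the examples within a batch changes (longest first);
    the content of each example is left as it is.
    """
    batches = []
    for start in range(0, len(examples), batch_size):
        batch = examples[start:start + batch_size]
        batches.append(sorted(batch, key=lambda ex: ex[key], reverse=True))
    return batches


# Hypothetical examples; 'num_samples' stands for the duration of the signal.
examples = [
    {'example_id': 'a', 'num_samples': 16000},
    {'example_id': 'b', 'num_samples': 48000},
    {'example_id': 'c', 'num_samples': 32000},
    {'example_id': 'd', 'num_samples': 8000},
]
batches = batch_and_sort(examples, batch_size=2)
```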

TCord · 2021-11-09T08:02:19Z

-def pre_batch_transform(inputs, return_keys=None):
+def pre_batch_transform(inputs):
+    """
+    Prepares the data through creating a dictionary with various data, which is computed through STFT.


"... by creating a dictionary with all data that is necessary for the model (e.g. STFT of observation)"
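A rough sketch of what such a transform could look like, written in plain Python so it runs without NumPy or padertorch. The input layout (`inputs['audio_data']['observation']`), the output keys (`Y_abs`, `num_frames`), and the naive `stft` helper with its `size` and `shift` parameters are all assumptions for illustration. The actual example would use the library's STFT and whatever keys the model expects.

```python
import cmath
import math


def stft(signal, size=8, shift=4):
    """Naive STFT: Hann-windowed frames followed by a one-sided DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / size) for n in range(size)]
    frames = []
    for start in range(0, len(signal) - size + 1, shift):
        frame = [signal[start + n] * window[n] for n in range(size)]
        frames.append([
            sum(x * cmath.exp(-2j * math.pi * k * n / size)
                for n, x in enumerate(frame))
            for k in range(size // 2 + 1)
        ])
    return frames


def pre_batch_transform(inputs):
    """Collect only the data the model needs into a flat dictionary."""
    Y = stft(inputs['audio_data']['observation'])
    return {
        'example_id': inputs['example_id'],
        # Magnitude spectrogram of the observation (frames x frequency bins)
        'Y_abs': [[abs(value) for value in frame] for frame in Y],
        'num_frames': len(Y),
    }


# Hypothetical input: 16 samples of a sine wave
example = {
    'example_id': 'demo',
    'audio_data': {
        'observation': [math.sin(2 * math.pi * n / 8) for n in range(16)],
    },
}
result = pre_batch_transform(example)
```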

TCord · 2021-11-09T08:06:44Z

-    """ Prepares the train and validation dataset from the database object """
+def prepare(_config):
+    """
+    Preparation of the train and validation datasets for the training and initialisation of the padertorch trainer,


We try to stick to American English: initialisation -> initialization

TCord · 2021-11-09T08:07:12Z

+    database_json = _config['database_json']

-    sacred.commands.print_config(_run)
+    # Initialisation of the trainer


initialisation -> initialization

TCord · 2021-11-09T08:09:45Z

+    checkpoint_path = trainer.checkpoint_dir / 'ckpt_latest.pth'
+
+    # Start of the training
+    trainer.register_validation_hook(validate_dataset)


Could you repeat the most important default arguments of the validation hook, so that it becomes clear which options can easily be modified for the validation (number of checkpoints, metric for the best checkpoint, ...)?

thequilo · 2021-11-11T07:14:15Z

Now that the maybe_add function disappeared from the data preparation, can you add an example_to_device method to the model that only transfers to the GPU those keys from the example that are required for training? You can use pt.data.batch.example_to_device for that
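One possible shape for such a method, as a sketch only. To keep it runnable without PyTorch, `move_to_device` is a hypothetical stand-in that merely tags each value with the device. In the real model, that call would be replaced by `pt.data.batch.example_to_device` as suggested above. The class name `Model` and the key names in `required_keys` are assumptions.

```python
def move_to_device(example, device=None):
    """Stand-in for pt.data.batch.example_to_device (assumption).

    Tags every value with the target device instead of moving tensors.
    """
    return {key: (value, device) for key, value in example.items()}


class Model:
    # Keys the model needs during training (hypothetical names)
    required_keys = ('Y_abs', 'X_abs', 'num_frames')

    def example_to_device(self, example, device=None):
        """Transfer only the keys required for training to the device."""
        needed = {
            key: example[key] for key in self.required_keys if key in example
        }
        return move_to_device(needed, device)


# Hypothetical example with one key that should stay on the CPU
example = {
    'Y_abs': [[0.1, 0.2]],
    'X_abs': [[0.3, 0.4]],
    'num_frames': 1,
    'audio_path': '/not/needed/on/gpu.wav',
}
moved = Model().example_to_device(example, device='cpu')
```

Filtering before the transfer keeps entries that are not used in training (paths, raw audio kept for evaluation, and so on) out of GPU memory.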

sibange added 2 commits November 8, 2021 15:49

Rework PIT example train.py and data.py

55c9a68

minor fixes PIT example

3906c9b

thequilo requested review from TCord and thequilo November 8, 2021 15:21

thequilo reviewed Nov 9, 2021

TCord reviewed Nov 9, 2021

sibange added 2 commits November 10, 2021 23:12

Update data.py

3ec3a6e

Update train.py

494692d

sibange wants to merge 4 commits into fgnt:master from sibange:master

3 participants
