Commit b2f6846

spec changes
1 parent b92b806 · commit b2f6846

File tree: 1 file changed (+11 −2 lines)

docs/homeworks/hw4.md

Lines changed: 11 additions & 2 deletions
```diff
@@ -549,8 +549,11 @@ Subtasks:
 a `tgt_mask` and a `src_mask` here. `tgt_mask` has both the causal mask and the pad mask applied
 for the English input into the Decoder. `src_mask` has the pad mask applied to it.
 
-You'll need to think about where to input the `src_mask` vs the `tgt_mask` (hint: the only function
-that actually deploys any masks is the `scaled_dot_product_attention` function)
+You'll need to think about where to input the `src_mask` vs the `tgt_mask` (hint: the only function
+that actually deploys any masks is the `scaled_dot_product_attention` function)
+
+Remember that our LM task will be decoder-only, so we don't want to do cross-attention in this case.
+When `enc_x` is `None`, make sure to skip the cross-attention step in your `DecoderLayer`.
 
 * Implement `Decoder`. This will be a `ModuleList` of your `DecoderLayer`s, just like in the `Encoder`.
   It will also need to handle the target embeddings and positional encoding.
```
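
For concreteness, here is a minimal single-head sketch of both hints. The sublayer names (`self_proj`, `cross_proj`, `ff`) and signatures are hypothetical, not the assignment's actual `DecoderLayer` API: it just shows that masks are deployed only inside `scaled_dot_product_attention`, that `tgt_mask` feeds self-attention while `src_mask` feeds cross-attention, and that cross-attention is skipped entirely when `enc_x` is `None`.

```python
# Minimal single-head sketch; sublayer names are hypothetical, not the
# assignment's API. Masks are boolean: True marks positions to block.
import math
import torch
import torch.nn as nn


def scaled_dot_product_attention(q, k, v, mask=None):
    # The only function that deploys any mask.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v


class DecoderLayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.self_proj = nn.ModuleDict({k: nn.Linear(d_model, d_model) for k in "qkv"})
        self.cross_proj = nn.ModuleDict({k: nn.Linear(d_model, d_model) for k in "qkv"})
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x, enc_x=None, tgt_mask=None, src_mask=None):
        # Self-attention over the target: tgt_mask = causal + pad mask.
        p = self.self_proj
        x = self.norm1(x + scaled_dot_product_attention(p["q"](x), p["k"](x), p["v"](x), tgt_mask))
        # Decoder-only LM: no encoder output, so skip cross-attention entirely.
        if enc_x is not None:
            p = self.cross_proj
            x = self.norm2(x + scaled_dot_product_attention(p["q"](x), p["k"](enc_x), p["v"](enc_x), src_mask))
        return self.norm3(x + self.ff(x))
```

In the NMT path you would call the layer with `enc_x` set to the encoder output and both masks; in the LM path you would call it with `enc_x=None` and only `tgt_mask`.
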
```diff
@@ -602,6 +605,12 @@ We've implemented the LM training script for you! Just add the same line
 that you added in the NMT task in the `TODO` line in
 `scripts/train_lm.py`.
 
+## Tracking experiments
+
+We've added simple `wandb` logging to your training scripts.
+Make sure to fill in your entity names in both scripts to track
+your experiments!
+
 ## Start training!
 
 1. Set your devices to be different values (based on which GPUs
```
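
As a reference for what filling in your entity name involves, a minimal `wandb` setup might look like the sketch below; the project name, config values, and logged metric are hypothetical placeholders, not the scripts' actual `TODO` lines.

```python
# Minimal sketch of wandb experiment tracking; entity and project below
# are placeholders: replace "your-entity" with your wandb username or team.
import wandb

run = wandb.init(
    entity="your-entity",        # fill in: your wandb username or team
    project="hw4-transformers",  # hypothetical project name
    config={"lr": 3e-4, "batch_size": 32},  # hypothetical hyperparameters
)
for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder standing in for your real training loss
    wandb.log({"train/loss": loss}, step=step)
run.finish()
```
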
