- Install Stack.
- Clone Hasktorch.
- Clone Hasktorch-tools.
- Change the torch paths in stack.yaml to the correct paths on your system.
- Configure .bashrc. Add to your .bashrc:

      HASKTORCH="~/.local/lib/hasktorch"

  In the cloned Hasktorch repository, run these commands in the terminal:

      source setenv
      echo $LD_LIBRARY_PATH

  Copy the output and add it to your .bashrc like this:

      export LD_LIBRARY_PATH=<LD_LIBRARY_PATH that you copied>
For linear regression:

    stack run linear

For Titanic:

    stack run titanic

Hyperparameters:
- Input Layer: 1
- Hidden Layer: 1
- Output Layer: 1
- Learning Rate: 0.01
- GD Optimizer:
  - Test Set: 1.0
  - Training Set: 0.8
- You can find the Titanic dataset here.
Hyperparameters:
- Input Layer: 7
- Hidden Layer: 21
- Output Layer: 1
- Learning Rate: 0.01
- GD Optimizer:
  - Kaggle Test Set: 0.62200
  - Training Set: 0.61616
- Adam Optimizer:
  - Kaggle Test Set: 0.74641
  - Training Set: 0.79904
Hyperparameters:
- Input Layer: 3074
- Hidden Layer: 256
- Output Layer: 256
- Learning Rate: 0.001
- Adam Optimizer:
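
| Epoch | Loss | Kaggle Accuracy | ValidData Accuracy | F1 Macro |
|-------|-----------|--------|--------|--------|
| 50    | 3258.0178 | -      | 0.4054 | 0.3384 |
| 550   | 2613.6038 | -      | 0.4978 | 0.4967 |
| 1050  | 2321.2322 | -      | 0.5046 | 0.5016 |
| 1550  | 2170.7373 | -      | 0.5084 | 0.5063 |
| 2050  | 1906.2094 | -      | 0.5088 | 0.5104 |
| 2550  | 1774.9410 | -      | 0.5182 | 0.5184 |
| 3050  | 1658.4749 | -      | 0.5096 | 0.5094 |
| 3550  | 1482.1902 | -      | 0.5166 | 0.5176 |
| 3850  | 2347.8490 | 0.5118 | 0.5100 | 0.5071 |
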
For multi-class classification, it is best to feed the data to the model in random order during training.
Use the confusion matrix to analyze the data.
Store the loss on the validation set and the loss on the training set, then compare both curves to analyze the model.
When evaluating, also store the precision and the recall to compare with the F1 scores.
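
The evaluation notes above can be sketched in plain Haskell. This is a minimal illustration that uses only `base`, not the code from this repository: the function names (`confusionMatrix`, `precision`, `recall`, `f1`, `macroF1`) and the sample predictions are assumptions made for the example, and classes are assumed to be encoded as integers from 0 to k-1.

```haskell
module Main where

-- | Confusion matrix for labels in [0 .. k-1]:
-- entry (row a, column p) counts samples of actual class a predicted as p.
confusionMatrix :: Int -> [(Int, Int)] -> [[Int]]
confusionMatrix k pairs =
  [ [ length [ () | (a, p) <- pairs, a == r, p == c ] | c <- [0 .. k - 1] ]
  | r <- [0 .. k - 1] ]

-- | Division that returns 0 when the denominator is 0.
ratio :: Int -> Int -> Double
ratio _ 0 = 0
ratio n d = fromIntegral n / fromIntegral d

-- | Precision of class c: true positives over everything predicted as c.
precision :: [[Int]] -> Int -> Double
precision m c = ratio (m !! c !! c) (sum [ row !! c | row <- m ])

-- | Recall of class c: true positives over everything that is actually c.
recall :: [[Int]] -> Int -> Double
recall m c = ratio (m !! c !! c) (sum (m !! c))

-- | F1 of class c: harmonic mean of precision and recall.
f1 :: [[Int]] -> Int -> Double
f1 m c
  | p + r == 0 = 0
  | otherwise  = 2 * p * r / (p + r)
  where
    p = precision m c
    r = recall m c

-- | Macro F1: unweighted mean of the per-class F1 scores.
macroF1 :: [[Int]] -> Double
macroF1 m = sum [ f1 m c | c <- [0 .. k - 1] ] / fromIntegral k
  where
    k = length m

main :: IO ()
main = do
  -- Made-up (actual, predicted) pairs for three classes, for illustration only.
  let pairs = [(0, 0), (0, 1), (1, 1), (1, 1), (2, 0), (2, 2)]
      m     = confusionMatrix 3 pairs
  mapM_ print m
  print (map (precision m) [0 .. 2])
  print (map (recall m) [0 .. 2])
  print (macroF1 m)
```

Keeping precision and recall next to F1 shows whether a low F1 for a class comes from false positives (a large column total) or from false negatives (a large row total) in the confusion matrix.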
Save the embedding values as parameters.
There are two tokenization techniques: subword and part of speech.
Biases capture the frequency of the words, which is useful for detecting the style of a text.
With the bias removed, the model detects the meaning of the words better.
Negative sampling: give the model a word that is not in the context of the sentence, labeled as not belonging to that context.
It is better to use negative data than positive data to train the model.
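
As an illustration of the negative sampling note above, here is a minimal sketch in plain Haskell. The names `contextOf` and `examplesFor`, the window size, and the toy vocabulary are hypothetical. As a simplification, the negatives are the first `k` candidates in vocabulary order, whereas word2vec implementations usually draw them at random.

```haskell
module Main where

-- | Words within `w` positions of index `i`, excluding the center word itself.
contextOf :: Int -> Int -> [String] -> [String]
contextOf w i ws =
  [ x | (j, x) <- zip [0 ..] ws, j /= i, abs (j - i) <= w ]

-- | Training examples for the center word at index `i`:
-- context words get label 1, and `k` vocabulary words that are NOT in the
-- context get label 0 (the negative samples).
-- Simplification: negatives are taken in vocabulary order, not at random.
examplesFor :: Int -> Int -> [String] -> [String] -> Int -> [(String, String, Int)]
examplesFor w k vocab ws i =
  [ (center, c, 1) | c <- ctx ] ++
  [ (center, n, 0) | n <- take k negatives ]
  where
    center    = ws !! i
    ctx       = contextOf w i ws
    negatives = [ v | v <- vocab, v /= center, v `notElem` ctx ]

main :: IO ()
main = do
  -- Toy vocabulary and sentence, for illustration only.
  let vocab    = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
      sentence = ["the", "cat", "sat", "on", "the", "mat"]
  -- Examples for the center word "sat" with window 1 and 2 negatives.
  mapM_ print (examplesFor 1 2 vocab sentence 2)
```

Each triple is (center word, other word, label), so the model sees both pairs that occur together and pairs that do not.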
Transformers are used today.
Use pre-trained models to train the model faster.
All words outside the 1000 most frequent words are treated as unknown words.
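
The vocabulary cut-off described above could look roughly like the following sketch. It uses only `base`; the function names `topWords` and `replaceUnknown` and the placeholder token `"<unk>"` are assumptions made for this example, and the demo uses a cut-off of 2 instead of 1000 to keep the output short.

```haskell
module Main where

import Data.List (group, sort, sortBy)
import Data.Ord (Down (..), comparing)

-- | The `n` most frequent words of a corpus, most frequent first.
-- Ties keep alphabetical order because `sortBy` is stable.
topWords :: Int -> [String] -> [String]
topWords n corpus =
  map fst (take n (sortBy (comparing (Down . snd)) counts))
  where
    counts = [ (w, length g) | g@(w : _) <- group (sort corpus) ]

-- | Replace every word outside the vocabulary with a placeholder token.
-- The token name "<unk>" is an assumption made for this sketch.
replaceUnknown :: [String] -> [String] -> [String]
replaceUnknown vocab = map (\w -> if w `elem` vocab then w else "<unk>")

main :: IO ()
main = do
  -- Toy corpus, for illustration only.
  let corpus = words "the cat sat on the mat the cat ran"
      vocab  = topWords 2 corpus
  print vocab
  print (replaceUnknown vocab (words "the dog sat"))
```

For a real corpus, a `Data.Map` from the containers package would be a more efficient way to count words and look them up than the list operations used here.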
Add hasBias as a parameter in MLPHyperParams.
To input multiple words into the word2vec model, build one-hot vectors such as w1 = [0,1,0] and w2 = [1,0,0], then sum them: w1 + w2 = [1,1,0].
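
The one-hot sum above can be written in a few lines of plain Haskell. This is only a sketch with hypothetical helper names (`oneHot`, `multiHot`) and plain lists rather than tensors.

```haskell
module Main where

-- | One-hot vector of length `n` with a 1 at position `i`.
oneHot :: Int -> Int -> [Int]
oneHot n i = [ if j == i then 1 else 0 | j <- [0 .. n - 1] ]

-- | Element-wise sum of the one-hot vectors of several word indices.
multiHot :: Int -> [Int] -> [Int]
multiHot n = foldr (zipWith (+) . oneHot n) (replicate n 0)

main :: IO ()
main = do
  let w1 = oneHot 3 1
      w2 = oneHot 3 0
  print w1                   -- [0,1,0]
  print w2                   -- [1,0,0]
  print (zipWith (+) w1 w2)  -- [1,1,0]
  print (multiHot 3 [1, 0])  -- [1,1,0]
```

If the same word index appears twice, its position in the summed vector becomes 2 rather than 1, so the result is a count vector rather than a strictly binary one.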
Create an issue for the saving.



