Key Concepts

Model Parameters:

Internal variables, such as weights and biases, that are adjusted during training to minimize the loss
Weights determine how much each input contributes to the predicted output, while biases act as offsets that shift the prediction, letting the model fit patterns that do not pass through the origin
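
As a minimal sketch of how weights and a bias combine into a prediction, the snippet below computes a linear score. The function name `predict` and all numeric values are hypothetical and chosen only for illustration.

```python
def predict(x, weights, bias):
    """Linear score: weighted sum of the inputs plus a bias offset."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

# Hypothetical parameters: the first input counts positively,
# the second negatively, and the bias shifts the result by 0.5.
weights = [2.0, -1.0]
bias = 0.5

# 2.0 * 1.0 + (-1.0) * 3.0 + 0.5
score = predict([1.0, 3.0], weights, bias)
print(score)
```

Training would adjust `weights` and `bias`; the inputs themselves stay fixed.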

Loss Function:

A score measuring how wrong the model's predictions are
Training tries to make this loss smaller
For example, if a model predicts "spam" with 90% confidence but the email is actually "not spam", the loss function returns a large value to penalize the confident mistake
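
The spam example can be sketched with binary cross-entropy, one common loss for two-class problems. The text does not say which loss is meant, so this choice is an assumption, and `binary_cross_entropy` is a hypothetical helper name.

```python
import math

def binary_cross_entropy(p_spam, is_spam):
    """Loss for one email given the predicted probability of 'spam'."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p_spam, eps), 1 - eps)
    return -math.log(p) if is_spam else -math.log(1 - p)

# The model says "spam" with 90% confidence.
loss_wrong = binary_cross_entropy(0.9, is_spam=False)  # email was not spam
loss_right = binary_cross_entropy(0.9, is_spam=True)   # email was spam

print(loss_wrong, loss_right)
```

The confident wrong prediction yields a much larger loss than the confident correct one, which is the signal training uses.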

Gradients:

Gradients measure how much a model's loss changes in response to small changes in its inputs or parameters; training follows them to reduce the loss, while an attacker can follow them to increase it
In open-box attacks, where the attacker has full access to the model, gradients identify which input modifications most effectively deceive the model
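
A minimal sketch of using an input gradient to push the loss up, assuming a tiny logistic-regression model with hypothetical weights. The perturbation step takes the sign of the gradient, in the style of the fast gradient sign method; the text does not name a specific attack, so this is only one illustrative choice.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_input_grad(x, y, w, b):
    """Cross-entropy loss and its gradient with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = sigmoid(z)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    # For logistic regression, dLoss/dx_i = (p - y) * w_i
    grad = [(p - y) * wi for wi in w]
    return loss, grad

def sign(v):
    return (v > 0) - (v < 0)

# Hypothetical model and input (true label y = 1).
w, b = [1.5, -2.0], 0.1
x, y = [1.0, 0.5], 1

loss, grad = loss_and_input_grad(x, y, w, b)

# Move each input feature a small step in the direction that increases the loss.
epsilon = 0.2
x_adv = [xi + epsilon * sign(g) for xi, g in zip(x, grad)]
loss_adv, _ = loss_and_input_grad(x_adv, y, w, b)

print(loss, loss_adv)
```

Here the attacker changes only the input; the model's parameters `w` and `b` are left untouched.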

Hard Labels:

The final predicted class alone (for example, "spam"), with no score or confidence attached

Scores:

In classification, scores are the raw outputs of a model before they are converted into probabilities or labels
In closed-box attacks, where only the model's outputs are visible, these scores reveal the model's confidence, helping attackers understand its decision-making

Soft Labels:

Soft labels are probability distributions over all classes, indicating the model's confidence in each class
For example, [0.9, 0.1] might mean 90% confidence in class 0 and 10% in class 1
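
The relationship between scores, soft labels, and hard labels can be sketched as follows. Softmax is assumed here as the conversion from raw scores to probabilities, which is common but not stated in the text, and the score values are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for class 0 and class 1.
scores = [2.0, -0.2]

soft_label = softmax(scores)  # roughly [0.9, 0.1]
hard_label = max(range(len(soft_label)), key=soft_label.__getitem__)

print(soft_label, hard_label)
```

An attacker who sees only `hard_label` learns far less per query than one who sees `soft_label` or `scores`.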

Decision Boundary:

An invisible line (or surface) in the input space that separates the regions assigned to different class labels
The attacker aims to "cross" this boundary with minimal changes to the input
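
For a linear classifier the smallest change that crosses the boundary has a closed form, which makes a convenient sketch. The model, its parameters, and the helper `minimal_crossing` are hypothetical; real models have curved boundaries, where this step is at best a local approximation.

```python
import math

def score(x, w, b):
    """Positive score means class 1, negative means class 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def minimal_crossing(x, w, b, overshoot=1.01):
    """Shift x along w just far enough to land on the other side of w.x + b = 0."""
    f = score(x, w, b)
    norm_sq = sum(wi * wi for wi in w)
    # Projecting onto the boundary uses f / ||w||^2; the overshoot factor
    # pushes the point slightly past it.
    return [xi - overshoot * f / norm_sq * wi for xi, wi in zip(x, w)]

# Hypothetical boundary x0 + x1 = 1 and a point on the class 1 side.
w, b = [1.0, 1.0], -1.0
x = [1.0, 1.0]

x_adv = minimal_crossing(x, w, b)
distance = math.sqrt(sum((a - c) ** 2 for a, c in zip(x_adv, x)))

print(score(x, w, b), score(x_adv, w, b), distance)
```

The sign of the score flips while `distance` stays small, which is the "minimal change" the attacker is after.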
