Manikandan-pt/Adversarial-ML

Key Concepts

Model Parameters:

  • These are the internal variables, such as weights and biases, that training adjusts to minimize the loss
  • Weights determine how strongly each input influences the output, while biases act as offsets that shift the prediction; together they let the model fit patterns in the data
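
As a minimal sketch of how weights and a bias combine, assuming the simplest possible model (a single linear unit), the prediction is a weighted sum of the inputs plus an offset. The function name `predict` and all numeric values are hypothetical and chosen only for illustration.

```python
def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus an offset (one linear unit)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# Hypothetical parameter values; training would adjust these.
weights = [2.0, -1.0]
bias = 0.5

output = predict(weights, bias, [1.0, 3.0])
print(output)  # 2.0*1.0 + (-1.0)*3.0 + 0.5 = -0.5
```

Changing a weight rescales the contribution of one input, while changing the bias shifts every prediction by the same amount.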

Loss Function:

  • A score of how wrong the model’s predictions are
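  • Training adjusts the model parameters to make this loss smaller
  • For example, if a model predicts "spam" with 90% confidence but the email is actually "not spam", the loss is high: under cross-entropy it is -ln(0.1) ≈ 2.30, versus -ln(0.9) ≈ 0.11 had the prediction been correct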
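
The spam example can be worked through in a few lines, assuming cross-entropy as the loss (the text does not name a specific loss function, so this choice is an assumption). The helper `cross_entropy` is a hypothetical name.

```python
import math

def cross_entropy(probabilities, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(probabilities[true_class])

# Class order is an assumption: index 0 = "spam", index 1 = "not spam".
probs = [0.9, 0.1]

loss_wrong = cross_entropy(probs, true_class=1)  # email was "not spam"
loss_right = cross_entropy(probs, true_class=0)  # email was "spam"

print(round(loss_wrong, 2), round(loss_right, 2))
```

The confident but wrong prediction yields a loss roughly twenty times larger than the confident and correct one, which is the signal training uses to adjust the parameters.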

Gradients:

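  • Gradients measure how much a model's loss changes in response to small changes in its inputs or parameters. Gradients with respect to parameters guide training; gradients with respect to inputs guide adversarial manipulation
  • In open-box (white-box) attacks, attackers have access to the model and use its gradients to find the input modifications most likely to deceive it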
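
To illustrate how an input gradient can guide an attack, the sketch below takes one signed-gradient step in the style of the fast gradient sign method. It assumes a tiny logistic-regression model whose gradient can be written by hand; the parameters, the input, and `epsilon` are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_input_gradient(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad_x = [(p - y) * wi for wi in w]  # dL/dx = (p - y) * w
    return loss, grad_x

# Hypothetical open-box model and input with true label y = 1.
w, b = [1.5, -2.0], 0.1
x, y = [1.0, 0.5], 1
epsilon = 0.25  # assumed perturbation budget per feature

loss_before, grad = loss_and_input_gradient(w, b, x, y)
# Move each feature by epsilon in the direction that increases the loss.
x_adv = [xi + epsilon * math.copysign(1.0, g) for xi, g in zip(x, grad)]
loss_after, _ = loss_and_input_gradient(w, b, x_adv, y)

print(x_adv, loss_before < loss_after)
```

Training uses the same kind of gradient but with respect to the parameters, and steps in the opposite direction to reduce the loss.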

Hard Labels:

  • The final answer (class label) the model picks, without showing probabilities or scores
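
A short sketch of what label-only access hides, assuming the model internally produces class probabilities and exposes only the index of the largest one. The function name `hard_label` and the probability values are hypothetical.

```python
def hard_label(class_probabilities):
    """Return only the index of the most probable class."""
    return max(range(len(class_probabilities)),
               key=class_probabilities.__getitem__)

# Two hypothetical predictions with very different confidence.
label_a = hard_label([0.51, 0.49])
label_b = hard_label([0.99, 0.01])

print(label_a, label_b)  # both are class 0
```

Both inputs receive the same label, so an observer who sees only hard labels cannot tell a borderline prediction from a confident one.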

Scores:

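  • In classification, scores are the raw outputs of a model (often called logits) before they are converted into probabilities or labels
  • In score-based closed-box (black-box) attacks, the attacker cannot inspect the model but can query it and observe these scores, which reveal its confidence and how that confidence shifts as the input changes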
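
One way an attacker might exploit returned scores, sketched under stated assumptions: query the model at slightly perturbed inputs and use finite differences to estimate how each feature moves the score margin. The `scores` function stands in for a hypothetical closed-box model; its weights are visible here only so that the example runs, and the attacker code never reads them.

```python
def scores(x):
    """Hypothetical closed-box model: returns one raw score per class."""
    w = [[1.0, -0.5], [-1.0, 0.8]]
    b = [0.2, -0.1]
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def margin(x):
    """Score of class 0 minus score of class 1, from a single query."""
    s = scores(x)
    return s[0] - s[1]

x = [1.0, 1.0]
h = 1e-3  # assumed probe size

# Estimate the sensitivity of the margin to each feature via queries only.
estimated = []
for i in range(len(x)):
    bumped = list(x)
    bumped[i] += h
    estimated.append((margin(bumped) - margin(x)) / h)

print([round(g, 3) for g in estimated])
```

The estimate approximates a gradient without any access to the model internals, at the cost of one extra query per input feature.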

Soft Labels:

  • Soft labels are probability distributions over all classes, indicating the model's confidence in each class
  • For example, [0.9, 0.1] might mean 90% confidence in class 0 and 10% in class 1
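
Soft labels are commonly obtained by applying softmax to the raw scores. The sketch below assumes that conversion and uses hypothetical score values picked to land near the [0.9, 0.1] example.

```python
import math

def softmax(raw_scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(raw_scores)
    # Subtracting the maximum avoids overflow without changing the result.
    exps = [math.exp(s - largest) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]

soft = softmax([2.2, 0.0])  # hypothetical raw scores for two classes
print([round(p, 2) for p in soft])  # roughly [0.9, 0.1]
```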

Decision Boundary:

  • An invisible line (or surface) in the input space that separates regions assigned to different class labels
  • The attacker aims to "cross" this boundary with minimal changes to the input
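
For a linear classifier the boundary and the smallest change needed to cross it can be computed exactly, which makes the idea concrete. This is a sketch assuming a boundary of the form w·x + b = 0 and distance measured in the L2 norm; real models have curved boundaries where this is only a local approximation. All names and values are hypothetical.

```python
import math

def score(w, b, x):
    """Signed score; its sign decides the class, zero is the boundary."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def minimal_crossing(w, b, x, overshoot=1e-6):
    """Smallest L2 step that moves x just across the boundary w.x + b = 0."""
    norm_sq = sum(wi * wi for wi in w)
    step = -(score(w, b, x) / norm_sq) * (1 + overshoot)
    return [xi + step * wi for xi, wi in zip(x, w)]

w, b = [3.0, 4.0], -5.0
x = [3.0, 4.0]

# Perpendicular distance from x to the boundary: |w.x + b| / ||w||.
distance = abs(score(w, b, x)) / math.sqrt(sum(wi * wi for wi in w))
x_adv = minimal_crossing(w, b, x)

print(distance)                                 # 4.0
print(score(w, b, x) > 0, score(w, b, x_adv) < 0)  # True True
```

The perturbation points along `w`, the direction perpendicular to the boundary, because any other direction would need a longer step to change the class.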

About

Adversarial Machine Learning Attacks - fundamentals, algorithms, techniques and tooling