Skip to content

Andrewwf1/AML-challenge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

AML BigData Challenge

This is a self-paced project for SQL learning and practicing. The data is from IMI BIGDataAIHub, synthesized data from Scotiabank.

Process

I firstly import raw datasets into Microsoft Azure Database, and then use SQL Server Management Studio (SMSS) to process the csv files into SQL tables. After cleaning, I created an analysis table that contains kyc data and actual transaction records for explorative data analysis. In the EDA, I convert raw transaction number into several indicators such as income-transaction ratio, volatility, and high-value ratio. Then I used window function to compare every customer within their occupation group and combined it with indicators to tag suspicious customers.

Conclusions

  1. The most suspicious customer criteria must be the kyc mismatch. For example, when the customer's occupation doesn't have a usual income (i.e. under 1000), but the total transaction amount is very big, or the high-value transactions are very frequent, then the customer's behaviour is not consistent.
  2. Identifying fraud is not always by one hard threshold. I observed that a lot of non-fraud customers have high volatility or high-value transactions, so the high volume is not a fraud signal. However, when combining it with other indicators such as group deviation (window function) and transaction ratio, the suspicious candidates become fewer and more precise.
  3. The global pattern doesn't contribute very much to AML analysis. During EDA, when I look at the entire table, no combination of indicators will perfectly separate fraud data from other non-fraud data. But when setting a particular scene, such as all customers from Ontario, have the same occupation, or have a similar total transaction amount, the rest of the indicators may distinguish fraud by abnormal spikes, a larger high-value ratio, or stable volatility compared to normal customers.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages