AML BigData Challenge

This is a self-paced project for learning and practicing SQL. The data comes from IMI BIGDataAIHub and is synthesized data from Scotiabank.

Process

I first imported the raw datasets into Microsoft Azure Database, then used SQL Server Management Studio (SSMS) to load the CSV files into SQL tables. After cleaning, I created an analysis table that combines KYC data with actual transaction records for exploratory data analysis (EDA). In the EDA, I converted raw transaction numbers into several indicators, such as income-transaction ratio, volatility, and high-value ratio. I then used window functions to compare every customer against their occupation group and combined the result with the indicators to tag suspicious customers.
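The indicator and window-function steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual queries: the table names (`kyc`, `transactions`), the column names, the toy rows, and the 10000 high-value cut-off are all assumptions, and the SQL runs on Python's built-in SQLite (version 3.25 or later for window functions) rather than on SQL Server.

```python
import sqlite3

# Hypothetical schema and toy rows; the real project tables are not shown in this README.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kyc (customer_id TEXT PRIMARY KEY, occupation TEXT, income REAL);
CREATE TABLE transactions (customer_id TEXT, amount REAL);
INSERT INTO kyc VALUES
  ('C1', 'Student', 800), ('C2', 'Student', 900), ('C3', 'Student', 700),
  ('C4', 'Engineer', 6000), ('C5', 'Engineer', 6500);
INSERT INTO transactions VALUES
  ('C1', 200), ('C1', 300),
  ('C2', 100), ('C2', 400),
  ('C3', 15000), ('C3', 20000), ('C3', 500),
  ('C4', 3000), ('C4', 2000),
  ('C5', 4000), ('C5', 12000);
""")

HIGH_VALUE = 10000  # assumed cut-off for a "high-value" transaction

query = """
WITH per_customer AS (
    -- One row per customer: raw transactions collapsed into indicators.
    SELECT k.customer_id, k.occupation, k.income,
           SUM(t.amount) AS total_amount,
           SUM(t.amount) / k.income AS txn_income_ratio,
           AVG(CASE WHEN t.amount >= :hv THEN 1.0 ELSE 0.0 END) AS high_value_ratio
    FROM kyc AS k
    JOIN transactions AS t ON t.customer_id = k.customer_id
    GROUP BY k.customer_id, k.occupation, k.income
)
SELECT customer_id, occupation, total_amount, txn_income_ratio, high_value_ratio,
       -- Window function: compare each customer with their occupation group.
       total_amount / AVG(total_amount) OVER (PARTITION BY occupation) AS vs_group_avg
FROM per_customer
ORDER BY customer_id
"""

indicators = {row[0]: row for row in conn.execute(query, {"hv": HIGH_VALUE})}
for row in indicators.values():
    print(row)
```

In this toy data, customer `C3` stands out on every column: a large transaction total relative to income, a high share of high-value transactions, and a total well above the average for the same occupation. A volatility indicator could be added in the same per-customer step, for example with the `STDEV` aggregate in SQL Server; it is left out here because SQLite has no built-in equivalent.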

Conclusions

The strongest indicator of a suspicious customer is a KYC mismatch. For example, when a customer's occupation does not usually carry a regular income (i.e. income under 1000) but the total transaction amount is very large, or high-value transactions are very frequent, the customer's behaviour is inconsistent with their profile.
Fraud cannot always be identified by a single hard threshold. I observed that many non-fraud customers have high volatility or many high-value transactions, so high volume alone is not a fraud signal. However, when it is combined with other indicators such as group deviation (computed with a window function) and transaction ratio, the set of suspicious candidates becomes smaller and more precise.
Global patterns contribute little to AML analysis. During EDA, when I looked at the entire table, no combination of indicators perfectly separated fraud from non-fraud customers. Within a particular segment, however, such as customers who are all from Ontario, share the same occupation, or have a similar total transaction amount, the remaining indicators may distinguish fraud through abnormal spikes, a larger high-value ratio, or unusually stable volatility compared to normal customers.
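The idea of combining indicators within a segment, instead of relying on one threshold, can be sketched as below. Everything here is illustrative: the `customer_indicators` table, its rows, and the thresholds for high-value ratio, transaction ratio, and group deviation are assumptions. Only the "income under 1000" cut-off comes from the conclusions above. The SQL again runs on Python's built-in SQLite (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical pre-computed indicator table; values are made up for illustration.
CREATE TABLE customer_indicators (
    customer_id TEXT, province TEXT, occupation TEXT, income REAL,
    total_amount REAL, high_value_ratio REAL
);
INSERT INTO customer_indicators VALUES
  ('C1', 'ON', 'Student',     800,   900, 0.0),
  ('C2', 'ON', 'Student',     700, 42000, 0.6),
  ('C3', 'ON', 'Student',     900,  1100, 0.0),
  ('C4', 'ON', 'Executive', 20000, 60000, 0.5),
  ('C5', 'ON', 'Executive', 22000, 50000, 0.5),
  ('C6', 'BC', 'Student',     850,  1000, 0.1);
""")

query = """
WITH flagged AS (
    SELECT customer_id,
           -- Rule 1: many high-value transactions (a weak signal on its own).
           (high_value_ratio >= :hv_ratio) AS high_value_flag,
           -- Rule 2: KYC mismatch, low income but a large transaction total.
           (income < :low_income AND total_amount > :ratio * income) AS kyc_mismatch_flag,
           -- Rule 3: outlier within the segment (same province and occupation).
           (total_amount > :dev * AVG(total_amount)
                OVER (PARTITION BY province, occupation)) AS group_outlier_flag
    FROM customer_indicators
)
SELECT customer_id, high_value_flag, kyc_mismatch_flag, group_outlier_flag
FROM flagged
ORDER BY customer_id
"""

# Assumed thresholds, except low_income, which follows the "under 1000" example.
params = {"hv_ratio": 0.5, "low_income": 1000, "ratio": 10, "dev": 2}
rows = conn.execute(query, params).fetchall()

single_rule = [r[0] for r in rows if r[1]]
combined = [r[0] for r in rows if r[1] and r[2] and r[3]]
print("high-value rule alone:", single_rule)
print("all three rules:", combined)
```

With this toy data the high-value rule alone flags three customers, including two executives whose activity fits their income, while requiring all three rules leaves a single candidate. SQLite returns comparison results as 0 or 1 directly; in SQL Server each flag would need a `CASE WHEN ... THEN 1 ELSE 0 END` expression instead.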
