kaggle action

Run CI/CD on Kaggle. This Action lets you use Kaggle's free GPU resources to run your tests. It is inspired by lvyufeng/action-kaggle-gpu-test and namiyousef/action-kaggle-gpu-test.

Feature

Kaggle provides a set of remote control tools and free GPU resources. Combined, they can be used for CI/CD.

For free users, Kaggle provides more than 30 hours of GPU usage per week, which is enough to test some small projects. For open source projects, using these free resources instead of renting GPU VMs yourself can save a lot of money.

Usage

Before using this Action, you need a Kaggle account.

To prevent abuse of server resources, Kaggle may require you to verify your account with a mobile phone number. If the network or the GPU is unavailable during execution, Kaggle may be restricting your account because it is not verified.
Then go to your Settings > API Tokens page. Click the "Generate New Token" button to create a new API token. Copy it to your clipboard.
Add the API token to your GitHub repository's secrets. You can name the secret KAGGLE_API_TOKEN or any other name you like. Make sure to keep it secret and do not share it with anyone.

If you want to use the legacy API credentials, you can set KAGGLE_USERNAME and KAGGLE_KEY secrets. However, it's recommended to use the API token instead of the legacy credentials.
To get them, go to your account page and create an API token. You'll get a file that looks something like this:
{
  "username": "USERNAME",
  "key": "TOKEN"
}
Add USERNAME and TOKEN to your GitHub repository's secrets as KAGGLE_USERNAME and KAGGLE_KEY respectively.
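
The two fields of the downloaded file map onto the two secrets. As a minimal sketch, assuming the file was saved locally (the helper name `secrets_from_kaggle_json` is hypothetical and not part of this Action), the following checks that both fields are present and returns the value that belongs in each secret:

```python
import json
from pathlib import Path


def secrets_from_kaggle_json(path):
    """Map the fields of a downloaded credentials file onto GitHub secret names.

    Hypothetical helper for illustration; run it locally, never in CI logs.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = {"username", "key"} - data.keys()
    if missing:
        raise ValueError(f"credentials file is missing fields: {sorted(missing)}")
    return {"KAGGLE_USERNAME": data["username"], "KAGGLE_KEY": data["key"]}
```

Avoid printing the returned values; paste them directly into the repository's secret settings.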

Then create your workflow file, for example:

name: kaggle gpu test
on:
  push:
    branches: [master, main]
jobs:
  kaggle-ci:
    name: kaggle CI
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
      - name: Run kaggle
        uses: Frederisk/kaggle-action@v2
        with:
          api_token: ${{ secrets.KAGGLE_API_TOKEN }}
          # or legacy API credentials
          # username: ${{ secrets.KAGGLE_USERNAME }}
          # key: ${{ secrets.KAGGLE_KEY }}
          # The title of the Kaggle kernel used for testing; choose a new one.
          # Try to avoid underscores, spaces or other special characters.
          title: KaggleTestCI
          # The location of your test script, which we will write next.
          code_file: .github/script/gpu_runner.py
          # and so on, you can set other parameters as needed.

Finally, write your own test script. Note that the script is executed on Kaggle's server, not on the GitHub Actions runner, so you may also need to clone the repository there. In a Python script, you can execute external commands through functions such as os.system, subprocess.call, subprocess.run, etc. Here's a simple example:

import os
import subprocess

def callsh(command):
  # Output streams straight to the kernel log.
  # check=True raises CalledProcessError on a non-zero exit code.
  subprocess.run(command, check=True)

callsh(['git', 'clone', 'https://github.com/name/repo_name'])
os.chdir('repo_name')
callsh(['bash', 'scripts/setup.sh'])
callsh(['conda', 'create', '-n', 'testenv', 'python=3.8.12', 'cudatoolkit=9.2', 'cudnn', '-y'])
callsh(['/opt/conda/envs/testenv/bin/pip', 'install', '-r', 'requirements.txt'])
callsh(['/opt/conda/envs/testenv/bin/pytest', 'tests'])
# ......
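
Because the script runs on Kaggle's server, it can help to confirm that a GPU is actually visible before the tests start. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH of the Kaggle image when a GPU is attached (an assumption, not something this Action guarantees); the helper name `gpu_visible` is hypothetical:

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    # "nvidia-smi -L" prints one line per detected GPU.
    result = subprocess.run([exe, "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


if not gpu_visible():
    print("warning: no GPU detected; GPU tests may fail or be skipped")
```

Placing this check at the top of the script makes a missing accelerator show up early in the kernel log instead of as an obscure test failure.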

Parameters

These parameters differ slightly from those of the Kaggle API, but the Kaggle API docs may still be informative.

api_token: Your Kaggle API token. If you have set this parameter, key and username will be ignored.
username: Your Kaggle username. Note that this is not your display name. If api_token is set and this parameter has a value, it overrides the current Kaggle kernel owner.
key: Your legacy Kaggle API key. It's recommended to use api_token instead of key and username. At least one of api_token and key is required. If both are set, api_token will be used.
id: The slug of the kernel. If not set, it will be generated from the title. Please try to avoid underscores, spaces or other special characters in the title, as they may cause problems when generating slugs.
title: Required. The title of the kernel. Please be aware that kernel titles and slugs are linked to each other. A kernel slug is always the title lowercased with dashes (-) replacing spaces.
code_file: Required. The path to your kernel source code.
language: Default value is python. The language your kernel is written in. Valid options are python, r, and rmarkdown.
kernel_type: Default value is script. The type of kernel. Valid options are script and notebook.
is_private: Default value is true. Whether or not the kernel should be private. true to make the kernel private, false to make it public.
enable_gpu: Default value is enable. Whether or not the kernel should run on a GPU. enable to run on the GPU, otherwise not.
enable_tpu: Default value is disable. Whether or not the kernel should run on a TPU. enable to run on the TPU, otherwise not.
enable_internet: Default value is enable. Whether or not the kernel should be able to access the internet. enable to use the internet, otherwise not.
dataset_sources: A list of dataset sources. It's an array of strings separated by newlines:
```
  dataset_sources: |
    dataset1
    dataset2
```
competition_sources: A list of competition sources. It's an array of strings separated by newlines.
kernel_sources: A list of kernel sources. It's an array of strings separated by newlines.
keywords: The keywords of the kernel. It's an array of strings separated by newlines.
docker_image: Which docker image to run with.
docker_image_pinning_type: Which docker image to use for executing new versions going forward.
machine_shape: The accelerator to be added to the kernel. If specified, it will override enable_gpu and enable_tpu. Default value is empty, which means no accelerator will be added. Check this page for the supported machine shapes.
timeout_seconds: The timeout for pushing the kernel. The default value is not specified.
fetch_time_seconds: The time interval for fetching the kernel status. The default value is 3 seconds.
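
To make the parameter list above more concrete, here is a minimal sketch of how such inputs could be turned into a dictionary in the style of the Kaggle API's kernel-metadata.json. The mapping, the helper names `split_lines` and `build_metadata`, and the example values are assumptions for illustration; the Action's actual logic may differ.

```python
import json


def split_lines(value):
    """Turn a newline-separated input into a list, dropping blank entries."""
    return [line.strip() for line in value.splitlines() if line.strip()]


def build_metadata(inputs):
    """Assemble a kernel-metadata.json style dict from action-style inputs.

    Hypothetical mapping; defaults follow the parameter descriptions above.
    """
    title = inputs["title"]
    # Fall back to a slug derived from the title when no id is given.
    slug = inputs.get("id") or title.lower().replace(" ", "-")
    return {
        "id": f"{inputs['username']}/{slug}",
        "title": title,
        "code_file": inputs["code_file"],
        "language": inputs.get("language", "python"),
        "kernel_type": inputs.get("kernel_type", "script"),
        "is_private": inputs.get("is_private", "true") == "true",
        "enable_gpu": inputs.get("enable_gpu", "enable") == "enable",
        "enable_internet": inputs.get("enable_internet", "enable") == "enable",
        "dataset_sources": split_lines(inputs.get("dataset_sources", "")),
        "competition_sources": split_lines(inputs.get("competition_sources", "")),
        "kernel_sources": split_lines(inputs.get("kernel_sources", "")),
    }


# Example values are placeholders, not real accounts or datasets.
example = build_metadata({
    "username": "USERNAME",
    "title": "KaggleTestCI",
    "code_file": ".github/script/gpu_runner.py",
    "dataset_sources": "dataset1\ndataset2\n",
})
print(json.dumps(example, indent=2))
```

The newline-separated inputs become plain lists, and the enable/disable style switches become booleans.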

Known Issues

When the title contains an underscore, the status of the execution instance may not be obtainable.
Kaggle's server is not a real virtual machine; your code actually runs inside Docker. As a result, some system commands or programs cannot work properly. For example, trying to start Docker (service docker start) fails with the error: cannot create directory "cpuset".
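
Since underscores and other special characters in the title are a known source of trouble, a workflow can check the title before pushing. This is a minimal sketch; the helper name `risky_characters` and the rule "ASCII letters and digits only" are conservative assumptions, not a rule documented by Kaggle or by this Action:

```python
import re


def risky_characters(title):
    """Return the distinct characters in a title that are not ASCII letters or digits."""
    return sorted(set(re.findall(r"[^A-Za-z0-9]", title)))


for candidate in ("KaggleTestCI", "Kaggle_Test CI"):
    found = risky_characters(candidate)
    if found:
        print(f"{candidate!r}: consider removing {found}")
    else:
        print(f"{candidate!r}: looks safe")
```

A title such as KaggleTestCI passes the check, while one containing an underscore or a space is flagged.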
