Skip to content

Coxabc/deweyr

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

deweyr DeweyHex2

An R interface to deweypy for downloading files from the Dewey file management system.

Overview

deweyr provides a simple way to download files from Dewey folders directly from R, without requiring manual Python environment setup. The package offers two download methods:

  • dewey_download() - Recommended method using UV (automatic Python environment management)
  • dewey_download_py() - Traditional method using an existing Python installation

Installation

Install the development version from GitHub:

# install.packages("devtools")
devtools::install_github("Coxabc/deweyr")

Quick Start

Method 1: Using UV (Recommended)

The easiest way to get started - no Python installation required:

library(deweyr)

download_dewey(
  api_key = "your-api-key",
  folder_id = "your-folder-id"
)

First-time setup: If UV isn't installed, deweyr will install it automatically. You may see a message recommending you restart your terminal for optimal performance in future runs.

Method 2: Using Existing Python

If you already have Python and deweypy installed:

library(deweyr)

download_dewey_py(
  api_key = "your-api-key",
  folder_id = "your-folder-id"
)

Usage Examples

Basic Download

# Download to default location (./dewey-downloads)
download_dewey(
  api_key = "your-api-key",
  folder_id = "abc123"
)

Custom Download Location

# Specify where to save files
download_dewey(
  api_key = "your-api-key",
  folder_id = "abc123",
  download_path = "C:/Users/YourName/Documents/data"
)

Download from URL

You can use either a folder ID or the full Dewey URL:

download_dewey(
  api_key = "your-api-key",
  folder_id = "https://api.deweydata.io/api/v1/external/data/abc123"
)

Advanced Options

Multi-threaded Downloads

Adjust the number of workers for faster downloads (default is 8):

download_dewey(
  api_key = "your-api-key",
  folder_id = "abc123",
  num_workers = 16  # Use 16 parallel workers
)

Date-Partitioned Datasets

For datasets partitioned by date, you can filter which partitions to download:

# Download only data from 2024 onwards
download_dewey(
  api_key = "your-api-key",
  folder_id = "abc123",
  partition_key_after = "2024-01-01"
)

# Download only data up to a certain date
download_dewey(
  api_key = "your-api-key",
  folder_id = "abc123",
  partition_key_before = "2023-12-31"
)

# Download a specific date range
download_dewey(
  api_key = "your-api-key",
  folder_id = "abc123",
  partition_key_after = "2024-01-01",
  partition_key_before = "2024-03-31"
)

Duck DB Options

Download from Duck DB

download_dewey_duck(
  api_key = "your-api-key",
  data_id = "dataset-from-deweydata",
  partition = "column-name-to-partition-by",
  where = NULL,
  select = NULL,
  overwrite=FALSE
)

Read Using Duck DB

read_dewey_duck(
  path = "path-to-read-in-already-downloaded-data",
  where = NULL
)

Get Dewey URL

get_dewey_urls_duck(
  api_key = "your-api-key",
  data_id = "dataset-from-deweydata",
  preview = FALSE
)

Preview with Duck DB

preview_dewey_duck(
  api_key = "your-api-key",
  data_id = "dataset-from-deweydata",
  limit = 10,
  where = NULL
)

Note: This package requires an active Dewey account and API key. Visit Dewey to learn more.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors