feat: add time-series-preprocessor agent kit #146
base: old-main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| TIME_SERIES_PREPROCESSOR="Your Flow ID from Lamatic Studio" | ||
| LAMATIC_API_URL="Your API Endpoint URL" | ||
| LAMATIC_PROJECT_ID="Your Project ID" | ||
| LAMATIC_API_KEY="Your API Key" |
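
All four variables in this template are needed before the kit can call the flow. A minimal sketch of a startup check follows, assuming a hypothetical `readLamaticEnv` helper that is not part of this PR. It takes the environment as a plain record so it can be exercised without Node typings, and it reports every missing variable in one error instead of failing on the first.

```typescript
// Hypothetical helper (not part of this PR): validates the four variables
// listed in .env.example and reports all missing ones in a single error.
const REQUIRED_VARS = [
  "TIME_SERIES_PREPROCESSOR",
  "LAMATIC_API_URL",
  "LAMATIC_PROJECT_ID",
  "LAMATIC_API_KEY",
] as const;

type LamaticEnvKey = (typeof REQUIRED_VARS)[number];
type LamaticEnv = Record<LamaticEnvKey, string>;

function readLamaticEnv(
  env: Record<string, string | undefined>,
): LamaticEnv {
  // Treat unset, empty, and whitespace-only values as missing.
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const result = {} as LamaticEnv;
  for (const name of REQUIRED_VARS) {
    result[name] = env[name] as string;
  }
  return result;
}
```

In a Next.js server module this could be called once as `readLamaticEnv(process.env)`, so a misconfigured `.env.local` fails at startup and not on the first request.
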
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| # See https://help.github.com/articles/ignoring-files/ for more about ignoring files. | ||
|
|
||
| # dependencies | ||
| /node_modules | ||
|
|
||
| # next.js | ||
| /.next/ | ||
| /out/ | ||
|
|
||
| # production | ||
| /build | ||
|
|
||
| # debug | ||
| npm-debug.log* | ||
| yarn-debug.log* | ||
| yarn-error.log* | ||
| .pnpm-debug.log* | ||
|
|
||
| # env files | ||
| .env | ||
|
|
||
| # vercel | ||
| .vercel | ||
|
|
||
| # typescript | ||
| *.tsbuildinfo | ||
| next-env.d.ts | ||
|
|
||
| .env.local |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,162 @@ | ||
| # Time-Series Preprocessor — Lamatic AgentKit | ||
|
|
||
| An automation kit that analyzes time-series dataset schemas and generates production-ready Python preprocessing pipelines using `pandas` and `scikit-learn`. Paste a JSON summary of your dataset and receive executable code in seconds. | ||
|
|
||
| --- | ||
|
|
||
| ## What It Does | ||
|
|
||
| By providing a JSON summary of your dataset, the agent generates a complete Python script that handles: | ||
|
|
||
| - **Missing value imputation** — forward-fill, mean, and median strategies selected based on column type | ||
| - **Feature scaling** — MinMaxScaler or StandardScaler applied appropriately | ||
| - **Datetime parsing and index management** — automatic timestamp detection and alignment | ||
| - **Categorical encoding** — label or one-hot encoding based on cardinality | ||
| - **Standardized implementation** — clean, readable `pandas` + `scikit-learn` code ready to run | ||
|
|
||
| --- | ||
|
|
||
| ## Project Background | ||
|
|
||
| This kit was developed to solve the repetitive nature of data cleaning in time-series projects. | ||
|
|
||
| The concept originated during the development of a **water demand forecasting model** for a college located in a rural area near Bhopal. Due to aging sensor hardware and inconsistent data streams, significant time was spent manually handling missing values and aligning disparate data sources — rainfall readings, local water levels, and consumption logs from different systems with mismatched timestamps. | ||
|
|
||
| The goal was to automate the boilerplate preprocessing code, allowing engineers to focus on model performance rather than manual cleanup. This kit is the result of that experience. | ||
|
|
||
| --- | ||
|
|
||
| ## Who Is It For | ||
|
|
||
| Data engineers and machine learning engineers who frequently work with time-series data and need a fast way to generate reliable, repeatable preprocessing pipelines without writing boilerplate code from scratch. | ||
|
|
||
| --- | ||
|
|
||
| ## Tech Stack | ||
|
|
||
| | Tool | Role | | ||
| |---|---| | ||
| | [Lamatic.ai](https://lamatic.ai) | Flow orchestration and Edge deployment | | ||
| | Gemini 2.5 Pro | AI-driven Python code generation | | ||
| | Next.js 14 | Interactive frontend sandbox | | ||
| | pandas + scikit-learn | Target libraries for generated pipelines | | ||
|
|
||
| --- | ||
|
|
||
| ## Setup | ||
|
|
||
| ### 1. Build and Deploy Flow in Lamatic Studio | ||
|
|
||
| 1. Sign in at [lamatic.ai](https://lamatic.ai) | ||
| 2. Create a new project (if you don't have one) | ||
| 3. Click **+ New Flow** and select **API Request** as the trigger | ||
| 4. Add a **Generate Text** node and select **Gemini 2.5 Pro** | ||
| 5. Set the input variable to `dataset_summary` | ||
| 6. Configure the system prompt to act as an expert data engineer | ||
| 7. Deploy the flow and copy your credentials from the Studio dashboard | ||
|
|
||
| ### 2. Environment Variables | ||
|
|
||
| Create a `.env.local` file in the kit root directory: | ||
|
|
||
| ```env | ||
| TIME_SERIES_PREPROCESSOR="Your Flow ID from Lamatic Studio" | ||
| LAMATIC_API_URL="Your API Endpoint URL" | ||
| LAMATIC_PROJECT_ID="Your Project ID" | ||
| LAMATIC_API_KEY="Your API Key" | ||
| ``` | ||
|
|
||
| | Variable | Where to Find It | | ||
| |---|---| | ||
| | `TIME_SERIES_PREPROCESSOR` | Studio → Your Flow → Settings | | ||
| | `LAMATIC_API_URL` | Studio → Your Flow → API Endpoint | | ||
| | `LAMATIC_PROJECT_ID` | Studio → Project Settings | | ||
| | `LAMATIC_API_KEY` | Studio → Project Settings → API Keys | | ||
|
|
||
| ### 3. Install and Run | ||
|
|
||
| ```bash | ||
| npm install | ||
| npm run dev | ||
| ``` | ||
|
|
||
| Open [http://localhost:3000](http://localhost:3000) to access the frontend interface. | ||
|
|
||
| --- | ||
|
|
||
| ## Example Input | ||
|
|
||
| Provide a JSON object describing your dataset structure: | ||
|
|
||
| ```json | ||
| { | ||
| "dataset_name": "sensor_readings", | ||
| "frequency": "1min", | ||
| "columns": [ | ||
| {"name": "timestamp", "type": "datetime"}, | ||
| {"name": "temperature", "type": "float", "missing_pct": 5}, | ||
| {"name": "pressure", "type": "float", "missing_pct": 2}, | ||
| {"name": "status", "type": "categorical", "missing_pct": 0} | ||
| ], | ||
| "rows": 50000, | ||
| "target_column": "temperature" | ||
| } | ||
| ``` | ||
|
|
||
| ## Example Output | ||
|
|
||
| The agent returns a fully executable Python script: | ||
|
|
||
| ```python | ||
| import pandas as pd | ||
| from sklearn.preprocessing import MinMaxScaler | ||
|
|
||
| # Load and index | ||
| df = pd.read_csv("sensor_readings.csv", parse_dates=["timestamp"]) | ||
| df.set_index("timestamp", inplace=True) | ||
|
|
||
| # Impute missing values | ||
| df["temperature"] = df["temperature"].ffill() | ||
| df["pressure"] = df["pressure"].fillna(df["pressure"].mean()) | ||
|
|
||
| # Scale numerical features | ||
| scaler = MinMaxScaler() | ||
| df[["temperature", "pressure"]] = scaler.fit_transform(df[["temperature", "pressure"]]) | ||
|
|
||
| print("Preprocessing complete.") | ||
| print(df.head()) | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## Project Structure | ||
|
|
||
| ````text | ||
| time-series-preprocessor/ | ||
| ├── actions/ | ||
| │ └── orchestrate.ts # Server action calling the Lamatic flow | ||
| ├── app/ | ||
| │ └── page.tsx # Main UI — input form and output display | ||
| ├── components/ | ||
| │ └── ui/ # shadcn/ui components | ||
| ├── flows/ | ||
| │ └── time-series-preprocessor/ | ||
| │ ├── config.json # Exported Lamatic flow graph | ||
| │ ├── inputs.json # Input schema definition | ||
| │ └── meta.json # Flow metadata | ||
| ├── lib/ | ||
| │ └── lamatic-client.ts # Lamatic SDK client | ||
| ├── .env.example # Environment variable template | ||
| ├── config.json # Kit metadata | ||
| └── README.md | ||
| ```` | ||
|
|
||
| --- | ||
|
|
||
| ## Contributing | ||
|
|
||
| Contributions are welcome. Open an issue or pull request in the [AgentKit repository](https://github.com/Lamatic/AgentKit). | ||
|
|
||
| ## License | ||
|
|
||
| MIT License — see [LICENSE](../../../LICENSE). |
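
The README's "Example Input" section describes the JSON summary only by example. The sketch below shows one way the frontend could check that shape before submitting it. The type names, the `parseDatasetSummary` function, and the specific rules (all five top-level fields required, `missing_pct` between 0 and 100, `target_column` naming an existing column) are assumptions inferred from that single example, not a schema published by the kit.

```typescript
// Illustrative types inferred from the README's example input (assumption).
interface ColumnSummary {
  name: string;
  type: string;
  missing_pct?: number;
}

interface DatasetSummary {
  dataset_name: string;
  frequency: string;
  columns: ColumnSummary[];
  rows: number;
  target_column: string;
}

type ParseResult =
  | { ok: true; value: DatasetSummary }
  | { ok: false; error: string };

// Hypothetical validator: returns a typed summary or a readable error.
function parseDatasetSummary(raw: string): ParseResult {
  const fail = (error: string): ParseResult => ({ ok: false, error });

  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return fail("input is not valid JSON");
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return fail("summary must be a JSON object");
  }
  const s = data as Record<string, unknown>;

  for (const key of ["dataset_name", "frequency", "target_column"]) {
    if (typeof s[key] !== "string" || s[key] === "") {
      return fail(`${key} must be a non-empty string`);
    }
  }
  const rows = s.rows;
  if (typeof rows !== "number" || rows < 0 || Math.floor(rows) !== rows) {
    return fail("rows must be a non-negative integer");
  }
  const columns = s.columns;
  if (!Array.isArray(columns) || columns.length === 0) {
    return fail("columns must be a non-empty array");
  }

  const names: string[] = [];
  for (const col of columns as unknown[]) {
    if (typeof col !== "object" || col === null) {
      return fail("each column must be an object");
    }
    const c = col as Record<string, unknown>;
    const name = c.name;
    const type = c.type;
    if (typeof name !== "string" || typeof type !== "string") {
      return fail("each column needs a string name and type");
    }
    const pct = c.missing_pct;
    if (pct !== undefined && (typeof pct !== "number" || pct < 0 || pct > 100)) {
      return fail(`missing_pct for ${name} must be between 0 and 100`);
    }
    names.push(name);
  }
  if (names.indexOf(s.target_column as string) === -1) {
    return fail("target_column must name one of the columns");
  }
  return { ok: true, value: data as DatasetSummary };
}
```

Returning a result object instead of throwing lets the form show the error next to the input without a `try`/`catch` at every call site.
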
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| "use server"; | ||
|
|
||
| import { createLamaticClient } from "@/lib/lamatic-client"; | ||
|
|
||
| const client = createLamaticClient(); | ||
|
|
||
| export async function preprocessTimeSeries(datasetSummary: string) { | ||
| if (!process.env.TIME_SERIES_PREPROCESSOR) { | ||
| throw new Error("TIME_SERIES_PREPROCESSOR environment variable is not set"); | ||
| } | ||
|
|
||
| try { | ||
| const response = await client.executeFlow({ | ||
| flowId: process.env.TIME_SERIES_PREPROCESSOR, | ||
| inputs: { | ||
| dataset_summary: datasetSummary, | ||
| }, | ||
| }); | ||
|
|
||
| return { | ||
| success: true, | ||
| result: response?.data?.generatedText || "", | ||
| }; | ||
| } catch (error) { | ||
| console.error("Error calling Lamatic flow:", error); | ||
| return { | ||
| success: false, | ||
| result: "Failed to generate preprocessing pipeline. Please try again.", | ||
| }; | ||
| } | ||
| } | ||
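
As written, the action returns `success: true` with an empty `result` when the response carries no `generatedText`. The sketch below shows one way to treat that case as a failure. `toActionResult` is a hypothetical helper, and the `FlowResponse` shape is inferred only from the optional chaining in the action above, not from the Lamatic SDK documentation.

```typescript
// Assumed response shape, inferred from `response?.data?.generatedText`
// in orchestrate.ts; the real SDK type may differ.
interface FlowResponse {
  data?: { generatedText?: string | null } | null;
}

interface ActionResult {
  success: boolean;
  result: string;
}

// Hypothetical helper: maps a flow response to the action's return shape,
// reporting a missing or blank generation as a failure.
function toActionResult(
  response: FlowResponse | null | undefined,
): ActionResult {
  const text = (response?.data?.generatedText ?? "").trim();
  if (text === "") {
    return {
      success: false,
      result: "The flow returned no code. Check the flow output and try again.",
    };
  }
  return { success: true, result: text };
}
```

With this in place the `try` branch could end with `return toActionResult(response);`, so the UI never renders an empty "successful" result.
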
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| @tailwind base; | ||
| @tailwind components; | ||
| @tailwind utilities; | ||
|
baggasiddhant marked this conversation as resolved.
|
||
|
|
||
| * { | ||
| box-sizing: border-box; | ||
| margin: 0; | ||
| padding: 0; | ||
| } | ||
|
|
||
| body { | ||
| font-family: 'Inter', sans-serif; | ||
| background-color: #f5f5f5; | ||
| color: #111; | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| import type { Metadata } from "next"; | ||
| import { ThemeProvider } from "@/components/ThemeProvider"; | ||
| import "./globals.css"; | ||
|
|
||
| export const metadata: Metadata = { | ||
| title: "Time-Series Preprocessor — Lamatic AgentKit", | ||
| description: "AI-powered time-series preprocessing pipeline generator", | ||
| }; | ||
|
|
||
| export default function RootLayout({ | ||
| children, | ||
| }: { | ||
| children: React.ReactNode; | ||
| }) { | ||
| return ( | ||
| <html lang="en"> | ||
| <head> | ||
| <link rel="preconnect" href="https://fonts.googleapis.com" /> | ||
| <link rel="preconnect" href="https://fonts.gstatic.com" crossOrigin="anonymous" /> | ||
| <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700;800;900&display=swap" rel="stylesheet" /> | ||
|
Comment on lines +18 to +20 (Contributor)

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider switching to `next/font/google` in place of the three manual `<link>` tags.

♻️ Proposed refactor

     import type { Metadata } from "next";
     import { ThemeProvider } from "@/components/ThemeProvider"
    +import { Inter } from "next/font/google"
     import "./globals.css";

    +const inter = Inter({ subsets: ["latin"] })

     export default function RootLayout({ children }: { children: React.ReactNode }) {
       return (
    -    <html lang="en">
    -      <head>
    -        <link rel="preconnect" href="https://fonts.googleapis.com" />
    -        <link rel="preconnect" href="https://fonts.gstatic.com" crossOrigin="anonymous" />
    -        <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700;800;900&display=swap" rel="stylesheet" />
    -      </head>
    -      <body style={{ margin: 0, padding: 0 }}>
    +    <html lang="en" className={inter.className}>
    +      <body style={{ margin: 0, padding: 0 }}>
             <ThemeProvider>{children}</ThemeProvider>
           </body>
         </html>
       );
     }

|
||
| </head> | ||
| <body style={{ margin: 0, padding: 0 }}> | ||
| <ThemeProvider> | ||
| {children} | ||
| </ThemeProvider> | ||
| </body> | ||
| </html> | ||
| ); | ||
| } | ||