declanrjb/word-buddy

Wordbuddy is a lightweight Python package for automating core tasks in a newspaper's production cycle. It was developed for the student newsroom of The Oberlin Review. Core features include:

  • Google Docs to HTML conversion
  • Google Docs to WordPress export
  • IDML to HTML conversion
  • IDML to WordPress export

Additional features, including an automated copy editor, are in progress.

Wordbuddy uses the OpenAI API in the implementation of some features, including IDML parsing. Generative AI workflows can make mistakes. Check all output before publishing.

Quickstart

To install wordbuddy, navigate to your project home directory and run:

git clone https://github.com/declanrjb/word-buddy

Then create a Python script in your project home directory and import the package:

import wordbuddy as wb

For full functionality, you will need API keys for OpenAI and Postmark. Store the keys in a tokens.env file in your project root directory under the following keys:

Worked Examples

The GitHub Pages site attached to this repository was generated programmatically using the following code.

Google Doc to Microsite

Automated web export of a draft news article written in Google Docs, the default editing environment of The Review. Note that HTML (including Flourish visualizations) can be included in the document as plain text and is successfully rendered inline in the export.

wb.docs_to_html('https://docs.google.com/document/d/1Idl2v1pN9EPEyIJjLi3oH2J49U9320q-Ur51ZIb9CRI/edit?usp=sharing', build_dir='demo/site')
Please visit this URL to authorize this application: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=21087935537-b984u5ds9qhssd4q40lo7umtsibti9he.apps.googleusercontent.com&redirect_uri=http%3A%2F%2Flocalhost%3A61581%2F&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.metadata.readonly+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocuments&state=b4cZY58S2j0GmLM6Crjeq4RH9078Wt&access_type=offline





'Results stored in ./demo/site/Wordbuddy_demo_SNAP.zip \nResults contain document in web format as index.html and associated assets in /assets'

Google Doc with Images, Lists, and Italics to Microsite

Automated web export of a demo document with images and text formatting.

wb.docs_to_html('https://docs.google.com/document/d/1Fc6SIuXvXVnEUVeVMrKqjNFfvR1FIAIU6xho0R_aPBs/edit?usp=sharing', build_dir='demo/site')
Please visit this URL to authorize this application: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=21087935537-b984u5ds9qhssd4q40lo7umtsibti9he.apps.googleusercontent.com&redirect_uri=http%3A%2F%2Flocalhost%3A63782%2F&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.metadata.readonly+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocuments&state=Cs7t1T0Wg6UsbUZhzuzDwlWWdvVunf&access_type=offline





'Results stored in ./demo/site/Wordbuddy_Formatting_Demo.zip \nResults contain document in web format as index.html and associated assets in /assets'

News articles from an InDesign file

Automated web export of the March 14, 2025 issue of The Oberlin Review.

wb.idml_to_html('demo/in_design/review_draft', output_dir='demo/site')
['demo/site/federal_funding_free.zip',
 'demo/site/board_of_trustees_vo.zip',
 'demo/site/city_of_oberlin_laun.zip',
 'demo/site/students_react_to_wo.zip',
 'demo/site/mercy_health_allen_h.zip',
 'demo/site/oberlin_hosts_intern.zip',
 'demo/site/federal_freezes_caus.zip',
 'demo/site/kendal_at_oberlin_fi.zip']

Core Functions

Google Docs to HTML

Wordbuddy can read the contents of Google Docs through the Google Drive API and reformat them as HTML documents. Documents are exported as zip files with the following structure:

doc_name
│   index.html 
│
└───assets
│   │   img1.png
│   │   img2.jpg
│   │   asset1.pdf
│   │   ...
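To check an export before publishing, the archive layout above can be inspected with the standard library. The following is a minimal sketch, not part of Wordbuddy: summarize_export is a hypothetical helper, and it assumes only that the archive contains an index.html and an assets folder, with or without a top-level doc_name folder.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_export(zip_path):
    """Return the index page and asset members found in an exported archive.

    Hypothetical helper for illustration; not part of the wordbuddy API.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]

    # Find index.html whether or not it sits under a top-level folder.
    index = next(
        (n for n in names if PurePosixPath(n).name == "index.html"), None
    )
    # Treat any member with an "assets" path component as an asset.
    assets = sorted(n for n in names if "assets" in PurePosixPath(n).parts)
    return {"index": index, "assets": assets}
```

Calling summarize_export on a zip file produced by docs_to_html returns a dictionary with the path of the index page (or None if it is missing) and a sorted list of asset paths.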

To format a Google Doc, run:

wb.docs_to_html('https://docs.google.com/document/d/1gDojaWPKDw0mYGh3CVO4PnZjXxfl9ZghrraYMTyVFr8/edit?tab=t.0')

Replace the URL with the link to your chosen document.
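Google's APIs address a document by its ID, which is the path segment after /document/d/ in a sharing link. As a rough sketch of how a link like the one above might be validated before it is passed along, the hypothetical helper below extracts that ID. It is an illustration only and makes no claim about how Wordbuddy parses links internally.

```python
import re

# Document IDs in Google Docs links use letters, digits, hyphens and underscores.
DOC_ID_PATTERN = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_doc_id(url):
    """Return the document ID from a Google Docs link.

    Hypothetical helper for illustration; not part of the wordbuddy API.
    """
    match = DOC_ID_PATTERN.search(url)
    if match is None:
        raise ValueError(f"Not a Google Docs link: {url!r}")
    return match.group(1)
```

Checking links this way gives an early, readable error for a mistyped URL instead of a failure partway through the authorization flow.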

Google Docs to WordPress

Wordbuddy can post Google Docs to your WordPress site using the built-in Post by Email feature. You will need your site's secret post by email address. See the WordPress Post by Email documentation to learn how to generate this address.

Once you have your post by email address, run:

wb.docs_to_wordpress('<link_to_document>', '<your_post_by_email>')

By default, posts are saved as drafts rather than published. To publish posts directly, pass the status='publish' flag.

wb.docs_to_wordpress('<link_to_document>', '<your_post_by_email>', status='publish')

Alternatively, you can store your post by email address in a tokens.env file under the key WP_POST_EMAIL. Wordbuddy will check this token if no post by email address is passed in the function call. An explicitly passed address takes precedence over the environment.

wb.docs_to_wordpress('<link_to_document>')
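The precedence rule described above (an explicit argument first, then WP_POST_EMAIL from tokens.env) can be sketched as follows. This is an illustration under stated assumptions, not Wordbuddy's implementation: read_env_file and resolve_post_email are hypothetical names, and the sketch assumes tokens.env holds one KEY=value pair per line.

```python
from pathlib import Path


def read_env_file(path):
    """Parse KEY=value lines from an env file into a dict (assumed format)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Ignore blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_post_email(explicit=None, env_path="tokens.env"):
    """Prefer an explicitly passed address, then fall back to WP_POST_EMAIL."""
    if explicit:
        return explicit
    email = read_env_file(env_path).get("WP_POST_EMAIL")
    if not email:
        raise ValueError("No post by email address passed or found in tokens.env")
    return email
```

Raising an error when neither source provides an address is a design choice of this sketch; it avoids silently sending a post nowhere.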

InDesign to HTML

In your InDesign project, go to File > Export and select "InDesign Markup (IDML)". Select your Wordbuddy project directory as the destination. InDesign will generate a folder containing an IDML file and associated assets.

In your Wordbuddy project directory, run:

wb.idml_to_html('<folder_containing_idml>')

Wordbuddy will generate a new folder in your project home directory named <folder_containing_idml>_as_html. The folder will contain a series of zip files in the format specified by docs_to_html. Each zip file contains the HTML package for one story in the original InDesign file.

Alternatively, you can pass an explicit build directory:

wb.idml_to_html('<folder_containing_idml>', output_dir='build_here')
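Because each story is delivered as its own zip file, a common follow-up step is to unpack every archive into a separate folder for review or hosting. The sketch below shows one way to do that with the standard library. unpack_stories is a hypothetical helper, not a Wordbuddy function, and the directory names passed to it are whatever you chose as the build directory.

```python
import zipfile
from pathlib import Path


def unpack_stories(build_dir, site_dir):
    """Extract every story archive in build_dir into its own folder under site_dir.

    Hypothetical helper for illustration; not part of the wordbuddy API.
    """
    site = Path(site_dir)
    unpacked = []
    for archive_path in sorted(Path(build_dir).glob("*.zip")):
        # Name each output folder after the archive, minus the .zip suffix.
        target = site / archive_path.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(target)
        unpacked.append(target)
    return unpacked
```

The function returns the list of folders it created, in alphabetical order of the archive names, so the result can be looped over to check each story before publishing.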

InDesign to WordPress

In your InDesign project, go to File > Export and select "InDesign Markup (IDML)". Select your Wordbuddy project directory as the destination. InDesign will generate a folder containing an IDML file and associated assets.

In your Wordbuddy project directory, run:

wb.idml_to_wordpress('<folder_containing_idml>', '<your post email>')

Each story in your InDesign file will be formatted as an HTML document and uploaded to your WordPress site. By default, posts are saved as drafts rather than published. You can publish posts directly by passing the status='publish' flag.

wb.idml_to_wordpress('<folder_containing_idml>', '<your post email>', status='publish')

As before, if no post email is specified, Wordbuddy will look for a WP_POST_EMAIL value in tokens.env.

wb.idml_to_wordpress('<folder_containing_idml>')
