Wordbuddy is a lightweight Python package for automating core tasks in a newspaper's production cycle. It was developed for the student newsroom of The Oberlin Review. Core features include:
- Google Docs to HTML conversion
- Google Docs to WordPress export
- IDML to HTML conversion
- IDML to WordPress export
Additional features, including an automated copy editor, are in progress.
Wordbuddy uses the OpenAI API in the implementation of some features, including IDML parsing. Generative AI workflows can make mistakes. Check all output before publishing.
To install wordbuddy, navigate to your project home directory and run:
git clone https://github.com/declanrjb/word-buddy
Then, create a Python script in your project home directory and run:
import wordbuddy as wb
For full functionality, you will need API keys for OpenAI and Postmark. Store them in a tokens.env file in your project root directory under the following names:
- OPENAI_API_KEY: How to get an OpenAI API key
- POSTMARK_API_KEY: How to get a Postmark API key
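Before calling features that depend on these keys, it can help to confirm that tokens.env is complete. The sketch below is not part of Wordbuddy: `parse_env_text` and `missing_keys` are hypothetical helpers, and the code assumes the file uses plain KEY=value lines, which is the usual convention for .env files but is not confirmed here.

```python
from pathlib import Path

# Key names taken from the list above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "POSTMARK_API_KEY")


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env_path="tokens.env", required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the env file."""
    path = Path(env_path)
    if not path.exists():
        return list(required)
    values = parse_env_text(path.read_text(encoding="utf-8"))
    return [key for key in required if not values.get(key)]
```

Calling `missing_keys()` from your project root returns an empty list when both keys are set, and otherwise names the ones still to be added.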
The GitHub Pages site attached to this repository was generated programmatically using the following code.
Automated web export of a draft news article written in Google Docs, the default editing environment of The Review. Note that HTML (including Flourish visualizations) can be included in the document as plain text and is successfully rendered inline in the export.
wb.docs_to_html('https://docs.google.com/document/d/1Idl2v1pN9EPEyIJjLi3oH2J49U9320q-Ur51ZIb9CRI/edit?usp=sharing', build_dir='demo/site')
Please visit this URL to authorize this application: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=21087935537-b984u5ds9qhssd4q40lo7umtsibti9he.apps.googleusercontent.com&redirect_uri=http%3A%2F%2Flocalhost%3A61581%2F&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.metadata.readonly+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocuments&state=b4cZY58S2j0GmLM6Crjeq4RH9078Wt&access_type=offline
'Results stored in ./demo/site/Wordbuddy_demo_SNAP.zip \nResults contain document in web format as index.html and associated assets in /assets'
Automated web export of a demo document with images and text formatting.
wb.docs_to_html('https://docs.google.com/document/d/1Fc6SIuXvXVnEUVeVMrKqjNFfvR1FIAIU6xho0R_aPBs/edit?usp=sharing', build_dir='demo/site')
Please visit this URL to authorize this application: https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=21087935537-b984u5ds9qhssd4q40lo7umtsibti9he.apps.googleusercontent.com&redirect_uri=http%3A%2F%2Flocalhost%3A63782%2F&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.metadata.readonly+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocuments&state=Cs7t1T0Wg6UsbUZhzuzDwlWWdvVunf&access_type=offline
'Results stored in ./demo/site/Wordbuddy_Formatting_Demo.zip \nResults contain document in web format as index.html and associated assets in /assets'
Automated web export of the March 14, 2025 issue of The Oberlin Review.
wb.idml_to_html('demo/in_design/review_draft', output_dir='demo/site')
['demo/site/federal_funding_free.zip',
'demo/site/board_of_trustees_vo.zip',
'demo/site/city_of_oberlin_laun.zip',
'demo/site/students_react_to_wo.zip',
'demo/site/mercy_health_allen_h.zip',
'demo/site/oberlin_hosts_intern.zip',
'demo/site/federal_freezes_caus.zip',
'demo/site/kendal_at_oberlin_fi.zip']
Wordbuddy can read the contents of Google Docs through the Google Drive API and reformat them as HTML documents. Documents are exported as zip files with the following structure:
doc_name
│   index.html
│
└───assets
│   │   img1.png
│   │   img2.jpg
│   │   asset1.pdf
│   │   ...
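To check that an export matches this layout before uploading it anywhere, the archive can be inspected with Python's standard zipfile module. This is a minimal sketch rather than a Wordbuddy feature: `summarize_export` is a hypothetical helper, and because it is not stated whether the archive wraps its contents in a top-level doc_name folder, the code matches index.html and assets at any depth.

```python
import zipfile


def summarize_export(zip_path):
    """Return (has_index, asset_names) for an exported zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    # Look for index.html at any depth in the archive.
    has_index = any(name.split("/")[-1] == "index.html" for name in names)
    # Collect file names that sit somewhere under an "assets" folder,
    # skipping directory entries (which end with a slash).
    assets = sorted(
        name.split("/")[-1]
        for name in names
        if "assets" in name.split("/")[:-1] and not name.endswith("/")
    )
    return has_index, assets
```

For example, `summarize_export('demo/site/Wordbuddy_demo_SNAP.zip')` would report whether that archive contains an index.html and which asset files came with it.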
To format a Google Doc, run:
wb.docs_to_html('https://docs.google.com/document/d/1gDojaWPKDw0mYGh3CVO4PnZjXxfl9ZghrraYMTyVFr8/edit?tab=t.0')
Replace the URL with the link to your chosen document.
Wordbuddy can post Google Docs to your WordPress site using WordPress's built-in post by email feature. You will need your site's secret post by email address. Learn how to generate your post by email here.
Once you have your post by email address, run:
wb.docs_to_wordpress('<link_to_document>', '<your_post_by_email>')
By default, posts are saved as drafts rather than published. Publish posts directly by passing the status='publish' flag.
wb.docs_to_wordpress('<link_to_document>', '<your_post_by_email>', status='publish')
Alternatively, you can store your post by email address in tokens.env under the key WP_POST_EMAIL. Wordbuddy will check this token if no post by email address is passed in the function call. An explicitly passed address takes precedence over the environment.
wb.docs_to_wordpress('<link_to_document>')
In your InDesign project, go to File > Export and select "InDesign Markup (IDML)". Select your Wordbuddy project directory as the destination. InDesign will generate a folder containing an IDML file and associated assets.
In your Wordbuddy project directory, run:
wb.idml_to_html('<folder_containing_idml>')
Wordbuddy will generate a new folder in your project home directory named <folder_containing_idml>_as_html. The folder will contain a series of zip files in the format specified by docs_to_html. Each zip file contains the HTML package for one story in the original InDesign file.
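Since the result is one zip file per story, a common next step is to unpack them all for review. The following is a sketch using only the standard library; `unpack_stories` is a hypothetical helper, not a Wordbuddy function, and it assumes nothing about the archives beyond their being ordinary .zip files in a single folder.

```python
import zipfile
from pathlib import Path


def unpack_stories(build_dir, dest_dir):
    """Extract every .zip in build_dir into dest_dir/<zip name without .zip>."""
    dest = Path(dest_dir)
    unpacked = []
    for zip_path in sorted(Path(build_dir).glob("*.zip")):
        # One folder per story, named after the archive.
        story_dir = dest / zip_path.stem
        story_dir.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(story_dir)
        unpacked.append(story_dir)
    return unpacked
```

The function returns the list of folders it created, so each story can be opened and checked before anything is published.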
Alternatively, you can pass an explicit build directory:
wb.idml_to_html('<folder_containing_idml>', output_dir='build_here')
In your InDesign project, go to File > Export and select "InDesign Markup (IDML)". Select your Wordbuddy project directory as the destination. InDesign will generate a folder containing an IDML file and associated assets.
In your Wordbuddy project directory, run:
wb.idml_to_wordpress('<folder_containing_idml>', '<your post email>')
Each story in your InDesign file will be formatted as an HTML document and uploaded to your WordPress site. By default, posts are saved as drafts rather than published. You can publish posts directly by passing the status='publish' flag.
wb.idml_to_wordpress('<folder_containing_idml>', '<your post email>', status='publish')
As before, if no post email is specified, Wordbuddy will look for a WP_POST_EMAIL value in tokens.env.
wb.idml_to_wordpress('<folder_containing_idml>')
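The fallback rule described for both WordPress functions (an explicitly passed address first, then WP_POST_EMAIL) can be illustrated as follows. This is a sketch of the documented behavior, not Wordbuddy's actual implementation: `resolve_post_email` is a hypothetical name, and the code assumes the values from tokens.env are available as a mapping such as os.environ.

```python
import os


def resolve_post_email(explicit=None, environ=None):
    """Pick the post by email address: explicit argument first, then WP_POST_EMAIL."""
    # Assumption: tokens.env values have been loaded into this mapping.
    env = os.environ if environ is None else environ
    email = explicit or env.get("WP_POST_EMAIL")
    if not email:
        raise ValueError(
            "No post by email address: pass one explicitly or set WP_POST_EMAIL."
        )
    return email
```

Raising an error when neither source provides an address is a design choice of this sketch; it surfaces the missing configuration before any upload is attempted.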