
Addition of vehicles.txt to GTFS schedule#636

Draft
doconnoronca wants to merge 32 commits into google:master from doconnoronca:vehicles

Conversation

@doconnoronca
Contributor

@doconnoronca doconnoronca commented May 11, 2026

Summary

Adds new file vehicles.txt to describe the capacity, accessibility, appearance and features of individual vehicles or vehicle ranges.

Describe the Problem

This proposal is based on issue 458. When combined with the vehicle label in GTFS Realtime or other real-time APIs, vehicles.txt allows consumers to give riders a variety of information about the approaching vehicle.

Use Cases

  • Inform riders about the type of vehicle approaching and if it is the same as the scheduled type of vehicle, such as when streetcars are replaced by buses.
  • Inform riders of the size, color and appearance of the approaching vehicle.
  • Inform consumers about the capacity of vehicles so that a better passenger load estimate can be calculated.
  • Allow analysts to calculate the actual capacity of a route based on vehicle capacity.
  • Inform riders of the availability of features including bike storage, luggage space, restrooms, changing points, WiFi.
  • Inform disabled riders of the accessibility features of approaching vehicles and how to use them.

Proposed Solution

This PR adds a new table, vehicles.txt, that describes the vehicles of a transit agency. It also adds vehicle_class columns to routes.txt and trips.txt, so that routes and trips can identify the vehicles that typically operate them and consumers can fall back on vehicles.txt information when real-time data isn't available.
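As a rough illustration of that fallback, here is a minimal sketch of how a consumer without real-time data might resolve a trip's vehicle_class against vehicles.txt. The column names are taken from this discussion, and it assumes vehicles.txt carries a vehicle_class column of its own. The sample rows, class values and helper names are invented for illustration and are not part of the proposal.

```python
import csv
import io

# Hypothetical sample data: column names come from this discussion,
# but every row and value below is invented for illustration.
VEHICLES_TXT = """vehicle_id,vehicle_id_high,vehicle_class,capacity_standing,capacity_bike
4400,4603,streetcar,130,0
8100,8219,bus_12m,51,2
"""

TRIPS_TXT = """trip_id,route_id,vehicle_class
t1,510,streetcar
t2,29,bus_12m
"""


def load(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def scheduled_vehicle_rows(trip_id, trips, vehicles):
    """Return the vehicles.txt rows whose vehicle_class matches the trip's."""
    trip = next(t for t in trips if t["trip_id"] == trip_id)
    return [v for v in vehicles if v["vehicle_class"] == trip["vehicle_class"]]


trips = load(TRIPS_TXT)
vehicles = load(VEHICLES_TXT)
rows = scheduled_vehicle_rows("t1", trips, vehicles)
print(rows[0]["vehicle_id"], rows[0]["capacity_standing"])
```

A class may match several vehicles.txt rows (different models with different features), so the helper returns a list and leaves it to the consumer to decide how to summarize them.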

Type of change

GTFS Schedule

  • Functional Change
  • Non-Functional Change
  • Documentation Maintenance

Additional Information

There is currently a vehicles file in TIDES. This proposal expands on that with additional details that are useful for riders.

The Proposal: Add cyclist position availability fields to VehiclePosition and CarriageDetails includes a total_cyclist_positions value, which wouldn't be needed if the data was available from vehicles.txt.

Some of the examples for the use of Proposal: Additional (Trip) Notices could be handled by vehicles.txt including the availability of WiFi, food and bicycle storage.

I have created a Google Sheet with vehicles data for Toronto TTC, Los Angeles Metro, New York Metro North rail and Toronto GO Train. These can be downloaded as Comma Separated Values to generate a vehicles.txt file. These have been loaded into TransSee.

Some transit enthusiasts who use TransSee have contributed files as well. Transit enthusiasts could crowdsource data in places where transit agencies don't generate it on their own. There could be TIDES-like extensions to this structure that would include data that is important to them.

BusTimes.org provides some of the same information for bus fleets in the UK.

Proposed Discussion Period

Because this is a large proposal, I suggest at least three months of discussion.

Testing Details

  • Consumer(s): TransSee
  • Producer(s): TransSee has produced some vehicles.txt files based on public data, and some transit enthusiasts have contributed others. I would like to see transit agencies start generating the file.
  • Estimated Testing Period: At least three months.

Proposal Update Tracker

Date | Update Description
2026-05-10 | Create PR draft

Checklist

@skinkie
Contributor

skinkie commented May 11, 2026

You are actually introducing Transmodel like functionality with newly invented terms. My suggestion would be to fully align the attribute names with NeTEx.

@etienne0101 etienne0101 added the GTFS Schedule and Change type: Functional labels May 11, 2026
@kona314

kona314 commented May 11, 2026

  • TODS defines a vehicles.txt with much more limited information. This would be a breaking conflict for producers attempting to provide both.
    • To resolve this, I'd suggest moving the bulk of the attributes in this file to a vehicle_groups.txt file (or other similar name), then define vehicles.txt as identical to TODS, but with an extra vehicle_group_id column referencing an ID in vehicle_groups.txt.
      • This would also handle the concerns in the next two bullets.
  • If the corresponding RT field is label, should vehicle_id be called vehicle_label instead? Same for vehicle_id_high, which would be vehicle_label_high.
  • Some agencies have letter prefixes/suffixes on their coaches, e.g. B0747 or 1312E. How would this work with defining ranges?
    • Are there libraries to expand e.g. B0741-B0749 and 1301E-1325E into 9 and 25 labels respectively? This is just not a problem I'm familiar with.
  • agency_id should be added and part of the primary key, with the same semantics as routes.agency_id, for handling feeds with multiple agencies.
    • If moving to vehicle_groups.txt, I don't think it would need to be part of the PK, since the group ID column would handle that.
  • Should cab be explicitly forbidden for non-rail vehicles?
  • To me, "standing capacity" is subtly different from "fully loaded capacity," as agencies often differentiate between comfortable standing room and crush loading. Should the description for capacity_standing draw a clearer line here?
    • I think my preference would be to rename to capacity_full and have the description explicitly tie this to the RT OccupancyStatus.FULL value and definition, which it feels like it's already alluding to.
  • Why are AC/USB-A/USB-C plugs defined the way they are? I think it's confusing to have 1 mean an unknown number, but all other numbers mean there are that many. What if there's one plug? (Most likely on small vehicles, e.g. vans or gondolas.)
    • Can -1 be used for an unknown quantity? This is more intuitive to me since you can't have negative plugs.
      • Bikes, wheelchairs, strollers, restrooms/accessible restrooms (maybe?), could also benefit from -1 as unknown quantity.
  • vehicle_type and fuel_type both reference US FTA values. I believe there's been a historic intent in GTFS to avoid endorsing any specific country's systems, which I would agree with.
    • I don't have an alternative proposal here. There are of course the GTFS route types, which are more limited, especially when not counting the extended route types.
  • I like the inclusion of vehicle_class. This almost invites a vehicle_classes.txt, which could define rider-facing values for this info like a name? This would also help the above with overcoming shortcomings of the GTFS route types (e.g. double-decker buses would still be vehicle_route_type = 3, but with a vehicle_class_name = "Double-decker").
  • Buses at my agency announce the route automatically only when the stop is served by multiple routes, otherwise no. Would this be value 2 of route_announcements?
  • Nitpicky naming consistency items:
    • capacity_bike and car_capacity having the word "capacity" switched seems unnecessary.
    • strollers should be capacity_strollers.
    • Some capacity items are plural (strollers) and others are not (bike, car).
    • bicycle_notes should be bike_notes for consistency with capacity_bike and trips.bikes_allowed.

@doconnoronca
Contributor Author

You are actually introducing Transmodel like functionality with newly invented terms. My suggestion would be to fully align the attribute names with NeTEx.

Based on the NeTEx standard these are the relevant fields.

  • SelfPropelled: Covered by fuel_type
  • TypeOfFuel: Same as fuel_type, but uses a different format. Also GTFS uses underscores between words.
  • EuroClass: Seems to be a specific model of bus.
  • LowFloor: Covered by boarding_steps.
  • HasLiftOrRamp: Covered by the accessible_boarding enum, with more detail.
  • HasHoist: Also covered by accessible_boarding.
  • Length: Matches.
  • Width, Height, Weight: Not included, but could be added if there is interest.

@doconnoronca
Contributor Author

  • TODS defines a vehicles.txt with much more limited information. This would be a breaking conflict for producers attempting to provide both.
    • To resolve this, I'd suggest moving the bulk of the attributes in this file to a vehicle_groups.txt file (or other similar name), then define vehicles.txt as identical to TODS, but with an extra vehicle_group_id column referencing an ID in vehicle_groups.txt.
      • This would also handle the concerns in the next two bullets.

Yes, I can see the benefit of creating two tables: one based on the TODS vehicles.txt plus vehicle_group_id and year_manufactured columns, with most of the other columns moved into a vehicle_groups.txt file. However, creating two tables makes the structure somewhat more complex.

I could just update vehicles.txt with all the columns to follow TODS instead of TIDES.

* If the corresponding RT field is `label`, should `vehicle_id` be called `vehicle_label` instead? Same for `vehicle_id_high`, which would be `vehicle_label_high`.

* Some agencies have letter prefixes/suffixes on their coaches, e.g. B0747 or 1312E. How would this work with defining ranges?
  
  * Are there libraries to expand e.g. B0741-B0749 and 1301E-1325E into 9 and 25 labels respectively? This is just not a problem I'm familiar with.

For an individual vehicle, TransSee finds the record whose low and high values bracket the vehicle's label, so a library to expand the ranges isn't really needed.
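A minimal sketch of that kind of lookup follows. It is not TransSee's actual code; the comparison rule is an assumption. A label is split into letter prefix, number and letter suffix, and a row matches when prefix and suffix agree and the number falls between the row's vehicle_id and vehicle_id_high. Rows with an empty vehicle_id_high are treated as a single vehicle, and the helper names are hypothetical.

```python
import re

# Assumed label grammar: optional letters, digits, optional letters.
_LABEL = re.compile(r"^([A-Za-z]*)(\d+)([A-Za-z]*)$")


def split_label(label):
    """Return (prefix, number, suffix), or None if the label has another shape."""
    m = _LABEL.match(label)
    if m is None:
        return None
    prefix, digits, suffix = m.groups()
    return prefix, int(digits), suffix


def find_vehicle_row(label, rows):
    """Return the first row whose range contains the label, else None."""
    key = split_label(label)
    for row in rows:
        low = row["vehicle_id"]
        high = row.get("vehicle_id_high") or low  # empty high: single vehicle
        if key is None:
            # Labels outside the assumed grammar only match exactly.
            if label == low:
                return row
            continue
        lo, hi = split_label(low), split_label(high)
        if lo is None or hi is None:
            continue
        same_affixes = (key[0], key[2]) == (lo[0], lo[2]) == (hi[0], hi[2])
        if same_affixes and lo[1] <= key[1] <= hi[1]:
            return row
    return None


# Invented sample rows for illustration only.
sample_rows = [
    {"vehicle_id": "B0741", "vehicle_id_high": "B0749", "vehicle_class": "coach"},
    {"vehicle_id": "1301E", "vehicle_id_high": "1325E", "vehicle_class": "trolley"},
    {"vehicle_id": "9000", "vehicle_id_high": "", "vehicle_class": "van"},
]
print(find_vehicle_row("B0747", sample_rows)["vehicle_class"])
```

Comparing the numeric part as an integer avoids the surprises of plain string comparison when labels in one range have different digit counts.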

* `agency_id` should be added and part of the primary key, with the same semantics as `routes.agency_id`, for handling feeds with multiple agencies.

This would help for some places with overlapping fleet numbers.

* Should `cab` be explicitly forbidden for non-rail vehicles?

The Niagara Falls People Mover used to have unpowered bus trailers with no cabs.

* To me, "standing capacity" is subtly different from "fully loaded capacity," as agencies often differentiate between comfortable standing room and crush loading. Should the description for `capacity_standing` draw a clearer line here?
  
  * I think my preference would be to rename to `capacity_full` and have the description explicitly tie this to the RT `OccupancyStatus.FULL` value and definition, which it feels like it's already alluding to.

That would probably be better.

* Why are AC/USB-A/USB-C plugs defined the way they are? I think it's confusing to have `1` mean an unknown number, but all other numbers mean there are that many. What if there's one plug? (Most likely on small vehicles, e.g. vans or gondolas.)
  
  * Can `-1` be used for an unknown quantity? This is more intuitive to me since you can't have negative plugs.

    * Bikes, wheelchairs, strollers, restrooms/accessible restrooms (maybe?), could also benefit from `-1` as unknown quantity.

It's worth considering. It looks like transfer_count in fare_transfer_rules.txt already uses -1 to mean no limit on transfers.
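If the -1 convention were adopted, a consumer could interpret a count column roughly as below. This is only a sketch of the suggestion above, not something the proposal currently defines. The function name and the wording of the returned strings are hypothetical, and treating an empty value as "no information" is an added assumption.

```python
UNKNOWN_QUANTITY = -1  # suggested sentinel: feature present, count unknown


def describe_count(raw):
    """Turn a raw count field (e.g. a plug or bike-rack column) into rider text."""
    if raw.strip() == "":
        return "no information"  # assumption: empty means the producer said nothing
    n = int(raw)
    if n == UNKNOWN_QUANTITY:
        return "available, number unknown"
    if n < 0:
        raise ValueError(f"unexpected negative count: {n}")
    if n == 0:
        return "none"
    return f"{n} available"


for value in ["", "-1", "0", "1", "24"]:
    print(repr(value), "->", describe_count(value))
```

With this reading, a value of 1 simply means one plug, which addresses the small-vehicle case raised in the question.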

* `vehicle_type` and `fuel_type` both reference US FTA values. I believe there's been a historic intent in GTFS to avoid endorsing any specific country's systems, which I would agree with.
  
  * I don't have an alternative proposal here. There are of course the GTFS route types, which are more limited, especially when not counting the extended route types.

* I like the inclusion of `vehicle_class`. This almost invites a `vehicle_classes.txt`, which could define rider-facing values for this info like a name? This would also help the above with overcoming shortcomings of the GTFS route types (e.g. double-decker buses would still be `vehicle_route_type = 3`, but with a `vehicle_class_name = "Double-decker"`).

I think that vehicle_group would be more specific than vehicle_class. A route could be in the double-decker class but have a few different models with different features from different vehicle_groups.

* Buses at my agency announce the route automatically only when the stop is served by multiple routes, otherwise no. Would this be value `2` of `route_announcements`?

Interesting feature.

* Nitpicky naming consistency items:
  
  * `capacity_bike` and `car_capacity` having the word "capacity" switched seems unnecessary.
  * `strollers` should be `capacity_strollers`.
  * Some capacity items are plural (strollers) and others are not (bike, car).
  * `bicycle_notes` should be `bike_notes` for consistency with `capacity_bike` and `trips.bikes_allowed`.

Good suggestions.


Labels

  • Change type: Functional (refers to modifications that significantly affect specification functionalities)
  • GTFS Schedule (issues and pull requests that focus on GTFS Schedule)


4 participants