libmetal: lib: linux: improve Linux UIO-backed device open and test coverage #365

Open
bentheredonethat wants to merge 4 commits into OpenAMP:main from bentheredonethat:uio-update

@bentheredonethat
Contributor

This series improves the Linux UIO-backed device-open flow in libmetal and
adds test coverage for the new API and Linux-specific helper paths.

The immediate motivation is to support Linux userspace applications that open
UIO-exposed devices through libmetal while keeping the existing bus/device
contract intact. In the current Linux implementation, the basic UIO path is
already present, but the backend is tightly coupled to bus probing, does not
cleanly separate resolved Linux device identities, does not unregister Linux
IRQ device state on close, and does not correctly retain the raw mapping
needed when a UIO map uses a non-zero offset.

This series addresses those gaps in three steps:

1. Add an explicit public helper,                                              
   metal_device_open_from_bus(), for bus/device based open while keeping       
   metal_device_open() as a compatibility wrapper.                             
                                                                               
2. Refactor the Linux UIO backend so it can populate a libmetal device         
   after resolving either a bus device name or a UIO class name, track the  
   resolved Linux identities more clearly, validate UIO map offsets, retain 
   raw mmap pointers for correct unmap, unregister Linux IRQ device state   
   on close, and tolerate Linux bus probe failure during metal_sys_init().  
                                                                               
3. Add cross-platform and Linux-specific tests for the new device-open         
   helper and the Linux UIO/IRQ bookkeeping helpers.                           

With these changes, downstream Linux host applications can continue to use
libmetal's device-open and IRQ registration model while relying on the
improved Linux UIO device handling in the library.

Comment thread lib/system/linux/device.c
Comment thread lib/system/linux/device.c
Comment thread lib/system/linux/irq.h
Comment thread lib/system/linux/sys.h Outdated
void *output, int len);
int metal_linux_uio_validate_offset(const char *dev_name,
unsigned int index,
unsigned long offset);
Contributor

Please add a Doxygen documentation header.

Comment thread lib/device.c Outdated

int metal_device_open(const char *bus_name, const char *dev_name,
struct metal_device **device)
int metal_device_open_from_bus(const char *bus_name, const char *dev_name,
Contributor

I do not understand the need to create a new API. It looks to me like this introduces confusion by duplicating APIs instead of making things more obvious.

Having a single system-agnostic API makes more sense to me.

@bentheredonethat
Contributor Author

Hi @arnopo

Thanks for the review. I agree the Linux change is too large as posted, and I can split it into smaller commits before respinning.

The main problem I am trying to solve is narrower than the current series makes it look. Linux userspace libmetal already opens platform/PCI devices through the existing Linux UIO-backed path.
Today applications have to pass the Linux bus device name, for example "ff360000.ipi". That name comes from the platform device/unit-address naming and is not a stable logical name across SoCs or
even across different instances.

For demos and portable userspace applications, we would like to use a logical name such as "demo-ipi". On systems using UIO, that logical name can be exposed by the kernel as the UIO class name,
for example:

  /sys/class/uio/uioX/name = demo-ipi

The Linux libmetal backend can then resolve that UIO class name back to the real bus device:

  demo-ipi
    -> /sys/class/uio/uioX/name
    -> /sys/class/uio/uioX/device
    -> /sys/bus/platform/devices/ff360000.ipi

and continue through the existing Linux bus open/bind/map/IRQ/close flow.
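
The first step of that resolution, matching a logical name against the `name` attribute of each `uioX` entry, can be sketched as follows. This is a minimal illustration rather than the code in this series: `find_uio_by_name()` and its parameters are hypothetical, the class directory is passed in (normally `/sys/class/uio`) so the helper can be exercised against a scratch directory, and returning -ENODEV for a missing class directory is a choice made for this sketch. Following the `device` link to the native bus device is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libmetal): scan class_dir for an entry
 * "uioX" whose "name" attribute equals uio_name. On success, copy the entry
 * name (for example "uio0") into entry and return 0.
 */
int find_uio_by_name(const char *class_dir, const char *uio_name,
		     char *entry, size_t entry_len)
{
	DIR *dir;
	struct dirent *de;
	int ret = -ENODEV;

	if (!class_dir || !uio_name || !entry || entry_len == 0)
		return -EINVAL;

	dir = opendir(class_dir);
	if (!dir)
		/* Assumption: a missing class directory means "no device". */
		return errno == ENOENT ? -ENODEV : -errno;

	while ((de = readdir(dir)) != NULL) {
		char path[512], name[128];
		FILE *fp;
		int n;

		if (strncmp(de->d_name, "uio", 3) != 0)
			continue;

		n = snprintf(path, sizeof(path), "%s/%s/name",
			     class_dir, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(path))
			continue;

		fp = fopen(path, "r");
		if (!fp)
			continue;
		if (!fgets(name, sizeof(name), fp)) {
			fclose(fp);
			continue;
		}
		fclose(fp);

		/* sysfs attributes end with a newline; strip it. */
		name[strcspn(name, "\n")] = '\0';
		if (strcmp(name, uio_name) != 0)
			continue;

		if (strlen(de->d_name) >= entry_len) {
			ret = -ENAMETOOLONG;
			break;
		}
		strcpy(entry, de->d_name);
		ret = 0;
		break;
	}

	closedir(dir);
	return ret;
}
```

A caller would pass "/sys/class/uio" and "demo-ipi" and, on success, build the `device` link path from the returned entry name.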

So the intended direction is not to make UIO mandatory for OpenAMP generally, and not to introduce a new Linux device model. It is only to let the existing Linux UIO backend accept the UIO class
name as an alias for the native bus device name.

I also agree with your comment about the new public API. The UIO-name use case does not require adding metal_device_open_from_bus(). I can drop that patch and keep the existing system-agnostic
API:

  metal_device_open("platform", "demo-ipi", &dev);

The Linux backend would interpret the device argument as either:

  1. the native bus device name, preserving existing behavior, or
  2. a UIO class-name alias, resolved internally to the native platform/PCI device.

Internally I plan to keep both identities clear, e.g. requested name / resolved bus device name / UIO name. For minimum behavior change, the returned device->name can remain the resolved native
bus device name; the UIO name is just an open-time alias.

For the respin, does the following direction sound acceptable?

  1. Drop the new metal_device_open_from_bus() public API.
  2. Split the Linux changes into smaller commits, likely:
    • UIO mmap offset validation and correct raw-pointer unmap tracking.
    • IRQ bookkeeping cleanup on device close.
    • UIO class-name alias resolution for the existing Linux UIO backend.
    • Tests and documentation updates.
  3. Include Doxygen comments in the same commits that introduce new internal Linux helpers.
  4. Clarify in the commit messages that UIO-name lookup is optional alias resolution for the existing Linux backend, not a requirement for OpenAMP Linux userspace in general.

If this direction makes sense, I will rework the series around that.

@arnopo
Contributor

arnopo commented May 12, 2026

> Hi @arnopo
>
> Thanks for the review. I agree the Linux change is too large as posted, and I can split it into smaller commits before respinning.
>
> The main problem I am trying to solve is narrower than the current series makes it look. Linux userspace libmetal already opens platform/PCI devices through the existing Linux UIO-backed path. Today applications have to pass the Linux bus device name, for example "ff360000.ipi". That name comes from the platform device/unit-address naming and is not a stable logical name across SoCs or even across different instances.
>
> For demos and portable userspace applications, we would like to use a logical name such as "demo-ipi". On systems using UIO, that logical name can be exposed by the kernel as the UIO class name, for example:
>
>   /sys/class/uio/uioX/name = demo-ipi
>
> The Linux libmetal backend can then resolve that UIO class name back to the real bus device:
>
>   demo-ipi
>     -> /sys/class/uio/uioX/name
>     -> /sys/class/uio/uioX/device
>     -> /sys/bus/platform/devices/ff360000.ipi
>
> and continue through the existing Linux bus open/bind/map/IRQ/close flow.

What about using a symbolic link for that, as proposed by @tnmysh in OpenAMP/openamp-system-reference#101?
That would avoid resolving through /sys/class/uio/uioX/name if a /sys/class/uio/<name>/device symbolic link is created with a udev rule.
Would it work in your case?

> So the intended direction is not to make UIO mandatory for OpenAMP generally, and not to introduce a new Linux device model. It is only to let the existing Linux UIO backend accept the UIO class name as an alias for the native bus device name.
>
> I also agree with your comment about the new public API. The UIO-name use case does not require adding metal_device_open_from_bus(). I can drop that patch and keep the existing system-agnostic API:
>
>   metal_device_open("platform", "demo-ipi", &dev);
>
> The Linux backend would interpret the device argument as either:
>
>   1. the native bus device name, preserving existing behavior, or
>   2. a UIO class-name alias, resolved internally to the native platform/PCI device.
>
> Internally I plan to keep both identities clear, e.g. requested name / resolved bus device name / UIO name. For minimum behavior change, the returned device->name can remain the resolved native bus device name; the UIO name is just an open-time alias.
>
> For the respin, does the following direction sound acceptable?
>
>   1. Drop the new metal_device_open_from_bus() public API.
>   2. Split the Linux changes into smaller commits, likely:
>     • UIO mmap offset validation and correct raw-pointer unmap tracking.
>     • IRQ bookkeeping cleanup on device close.
>     • UIO class-name alias resolution for the existing Linux UIO backend.
>     • Tests and documentation updates.
>   3. Include Doxygen comments in the same commits that introduce new internal Linux helpers.
>   4. Clarify in the commit messages that UIO-name lookup is optional alias resolution for the existing Linux backend, not a requirement for OpenAMP Linux userspace in general.

Sounds good.

Thanks
arnaud

> If this direction makes sense, I will rework the series around that.

@bentheredonethat
Contributor Author

Thanks @arnopo, this is a good point.

Using a udev-created symlink such as:

/sys/class/uio/<logical-name>/device
would work in our use case and is a nice optimization when present, since it avoids scanning uioX/name.

For upstream libmetal, I would prefer to treat this as optional platform integration rather than a hard dependency, because not all deployments guarantee custom udev rules. So my plan is:

  1. keep the generic fallback that resolves via /sys/class/uio/uioX/name,
  2. optionally try the symlink path first when it exists.

That keeps behavior portable out of the box while allowing integrators to use the symlink approach for faster/cleaner lookup.

If you agree, I will document this in the commit message as “optional udev optimization, generic fallback preserved”.

@arnopo
Contributor

arnopo commented May 12, 2026

> Thanks @arnopo, this is a good point.
>
> Using a udev-created symlink such as:
>
> /sys/class/uio/<logical-name>/device would work in our use case and is a nice optimization when present, since it avoids scanning uioX/name.
>
> For upstream libmetal, I would prefer to treat this as optional platform integration rather than a hard dependency, because not all deployments guarantee custom udev rules. So my plan is:
>
>   1. keep the generic fallback that resolves via /sys/class/uio/uioX/name,
>   2. optionally try the symlink path first when it exists.
>
> That keeps behavior portable out of the box while allowing integrators to use the symlink approach for faster/cleaner lookup.
>
> If you agree, I will document this in the commit message as “optional udev optimization, generic fallback preserved”.

I would prefer that we handle this in the same way we manage /dev/rpmsgX or /sys/class/remoteproc/remoteprocX devices, rather than adding it to libmetal.

I propose adding this PR to the agenda for the next OpenAMP meeting so we can discuss it further.

@bentheredonethat
Contributor Author

Hi @arnopo @wmamills @tnmysh, I have some updates here:

Update: I pushed the revised libmetal changes to this PR.

The current branch now fleshes out the Linux uio bus support while preserving the pre-existing platform/device open flow. Existing users that open devices through the platform bus continue to use the same bind-and-open path. The new uio bus path is additive and lets userspace open already exposed UIO devices by their /sys/class/uio/uioX/name value, which gives applications a stable logical lookup path without requiring generated platform device names.

I also kept the fallback behavior in place: native platform-bus open remains the primary path for existing users, and the UIO class-name path is only used when callers explicitly request the uio bus. The shared UIO populate logic is reused so both paths get the same mmap offset handling, IRQ setup, DMA-map behavior, and cleanup.

I investigated the symlink-based lookup option as requested. I do not think symlinks are needed for this PR. The kernel already exposes the stable lookup key we need through /sys/class/uio/uioX/name, and resolving that directly avoids adding another filesystem convention that would need to be created, documented, kept in sync with UIO enumeration, and handled across distros/init systems/containers. Using the existing UIO class metadata keeps the implementation self-contained in libmetal and avoids requiring deployment-side symlink management.

So the PR now takes this approach:

  • Add explicit uio bus support for opening devices by UIO class name.
  • Preserve the existing platform bus behavior and fallback path.
  • Reuse the same UIO populate/mapping/IRQ cleanup logic for both paths.
  • Avoid symlink lookup because /sys/class/uio/uioX/name is sufficient and already available.

Contributor

@arnopo arnopo left a comment

@bentheredonethat
Sorry for the delay, please find some comments below.
I need to spend more time on "lib: linux: add UIO bus open by class name" to understand your work. Adding more comments would help me.

Comment thread lib/system/linux/device.c Outdated
Comment thread lib/system/linux/device.c Outdated
unsigned long offset,
metal_phys_addr_t *phys,
size_t *map_len,
size_t *region_size)
Contributor

Define structures to avoid passing too many parameters.

Comment thread lib/system/linux/sys.h Outdated
Comment thread lib/system/linux/irq.c Outdated
irqs[irq].arg = NULL;
metal_mutex_release(&irq_lock);

metal_linux_irq_notify();
Contributor

The code above should not be done here. You should only have a check:

if (metal_linux_irq_is_enabled(irq))
	return -EINVAL;

=> the standard metal_irq_disable() should be called beforehand by the IRQ consumer.

Contributor Author

ok will fix.

metal_linux_irq_unregister_dev() will only detach the Linux fd-to-device bookkeeping after the consumer has disabled the IRQ.
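
The agreed contract, where unregistering only detaches the device pointer and refuses to do so while the IRQ is still enabled, can be sketched as below. This is a simplified model under stated assumptions: the `sketch_` names, the fixed-size table, and the use of -EINVAL for the "still enabled" case (following the review suggestion) are illustrative, and the locking that the real IRQ code performs with `irq_lock` is omitted.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_IRQS 16

/* Hypothetical per-IRQ bookkeeping: owning device and enable state. */
struct sketch_irq_slot {
	void *dev;
	bool enabled;
};

static struct sketch_irq_slot sketch_irqs[SKETCH_MAX_IRQS];

static bool sketch_irq_valid(int irq)
{
	return irq >= 0 && irq < SKETCH_MAX_IRQS;
}

/* Associate an IRQ (a UIO fd in the Linux backend) with its device. */
int sketch_irq_register_dev(int irq, void *dev)
{
	if (!sketch_irq_valid(irq) || !dev)
		return -EINVAL;
	sketch_irqs[irq].dev = dev;
	return 0;
}

/* Stand-in for the standard enable/disable path owned by the consumer. */
int sketch_irq_set_enabled(int irq, bool enabled)
{
	if (!sketch_irq_valid(irq) || !sketch_irqs[irq].dev)
		return -EINVAL;
	sketch_irqs[irq].enabled = enabled;
	return 0;
}

/* Detach the device only; never touch handler or enable state here. */
int sketch_irq_unregister_dev(int irq)
{
	if (!sketch_irq_valid(irq))
		return -EINVAL;
	if (sketch_irqs[irq].enabled)
		return -EINVAL; /* the consumer must disable the IRQ first */
	sketch_irqs[irq].dev = NULL;
	return 0;
}

void *sketch_irq_dev(int irq)
{
	return sketch_irq_valid(irq) ? sketch_irqs[irq].dev : NULL;
}
```

The point of the split is that the close path cannot silently tear down an interrupt that a consumer still expects to fire.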

Comment thread lib/system/linux/device.c Outdated
Comment thread lib/system/linux/device.c Outdated

dir = opendir(METAL_UIO_CLASS_PATH);
if (!dir)
return -errno;
Contributor

This does not look like a standard error code.

Contributor Author

@bentheredonethat bentheredonethat Jun 2, 2026

Addressed by mapping missing /sys/class/uio to -ENODEV, while preserving other opendir() failures as -errno. This keeps “UIO class unavailable” consistent with the rest of Linux bus probing.

Comment thread lib/system/linux/device.c
Comment thread lib/system/linux/sys.h Outdated
Comment thread lib/system/linux/device.c Outdated
}

int metal_linux_uio_validate_offset(const char *dev_name,
unsigned int index,
Contributor

index seems useless, as it is only used to print the error message.

Contributor Author

will remove

Comment thread lib/system/linux/device.c Outdated
error = ldrv->dev_open(lbus, ldev);
if (error) {
if (open_error == -ENODEV || error != -ENODEV)
open_error = error;
Contributor

It seems to me that this is not the right strategy.

  1. If ldrv->dev_open returns an error, the device is probably not opened, so there is no need to close it.
  2. If the open fails, we should keep the error and close all devices already opened.

	for_each_linux_driver(lbus, ldrv) {

		/* Check if we have a viable driver. */
		if (!ldrv->sdrv || !ldrv->dev_open)
			continue;

		/* Reset device data. */
		memset(ldev, 0, sizeof(*ldev));
		strncpy(ldev->dev_name, dev_name, sizeof(ldev->dev_name) - 1);
		ldev->fd = -1;
		ldev->ldrv = ldrv;
		ldev->device.bus = bus;

		/* Try and open the device. */
		error = ldrv->dev_open(lbus, ldev);
		if (error) {
			goto close_dev;
		}

		*device = &ldev->device;
		(*device)->name = ldev->dev_name;

		metal_list_add_tail(&bus->devices, &(*device)->node);
		return 0;
	}

close_dev:
	for_each_linux_driver(lbus, ldrv) {
		ldev->ldrv = ldrv;
		ldrv->dev_close(lbus, ldev);
		metal_list_del(&ldev->device.node);
	}
	free(ldev);

	return error;

Contributor Author

ok will do this - thanks

The Linux bus open path may try more than one backend driver for a
device. When a backend finds the device but fails while opening it,
the common open loop currently discards that errno and returns
-ENODEV after all drivers have been tried.

Keep the first useful backend open error, preferring non-ENODEV
failures over a plain miss. This preserves the existing not-found
result while letting callers see real failures such as UIO map
population errors.

Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>

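
The error-selection rule described in this commit message can be illustrated with a small standalone loop. This is a sketch that follows the wording of the message (keep the first non-ENODEV failure, fall back to -ENODEV on a plain miss) and is not the loop from the patch: `sketch_open_any()`, the function-pointer type, and the stub drivers are hypothetical stand-ins for the Linux backend drivers.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a backend driver's dev_open callback. */
typedef int (*sketch_open_fn)(const char *dev_name);

/*
 * Try each driver in order. Return 0 as soon as one opens the device.
 * Otherwise return the first error that is not -ENODEV, or -ENODEV if
 * every driver simply did not find the device.
 */
int sketch_open_any(const sketch_open_fn *drivers, size_t count,
		    const char *dev_name)
{
	int result = -ENODEV;
	size_t i;

	for (i = 0; i < count; i++) {
		int err;

		if (!drivers[i])
			continue;

		err = drivers[i](dev_name);
		if (!err)
			return 0;

		/* Prefer a real failure over a plain miss; keep the first. */
		if (result == -ENODEV && err != -ENODEV)
			result = err;
	}

	return result;
}

/* Stub drivers used only to exercise the selection rule. */
int sketch_drv_miss(const char *n)     { (void)n; return -ENODEV; }
int sketch_drv_map_fail(const char *n) { (void)n; return -ENOMEM; }
int sketch_drv_io_fail(const char *n)  { (void)n; return -EIO; }
int sketch_drv_ok(const char *n)       { (void)n; return 0; }
```

With drivers ordered miss, map failure, I/O failure, the caller sees -ENOMEM rather than -ENODEV, which is the visibility the commit message asks for.
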
UIO map offsets identify the usable resource start inside the
page-aligned mapping exposed by sysfs. The Linux backend previously
exposed and unmapped the adjusted virtual address directly.

Keep the raw mmap base and length for close, expose the usable
virtual address as raw mapping plus offset, and derive the libmetal
physical base and size from the usable portion of the UIO map.

Use the sysfs map size as the mmap length. For an unaligned resource,
UIO already reports a page-aligned address and a full mmap length, so
adding the offset to that length can over-map the resource and fail.

Reject offsets outside the system page size, reject offsets beyond the
map size, and report overflow before attempting to mmap the region.

Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>

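
The bookkeeping this commit message describes can be sketched without touching a real device. The example below assumes the raw mapping has already been obtained (for instance from mmap() with the sysfs map size as length) and only shows the arithmetic: validate the offset, keep the raw base and length for the later unmap, and derive the usable view from them. The struct, the function name, and the specific error codes are assumptions made for illustration; grouping the values in one struct also echoes the review request to avoid long parameter lists.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one UIO map: raw mapping plus usable window. */
struct sketch_uio_map {
	void *raw;       /* base returned by mmap(), needed for munmap() */
	size_t raw_len;  /* length passed to mmap() (sysfs map size) */
	void *virt;      /* usable virtual address: raw + offset */
	uint64_t phys;   /* usable physical base: addr + offset */
	size_t size;     /* usable size: map size minus offset */
};

int sketch_uio_map_init(struct sketch_uio_map *m, void *raw, uint64_t addr,
			size_t map_size, unsigned long offset,
			size_t page_size)
{
	if (!m || !raw || page_size == 0)
		return -EINVAL;

	/* The offset locates the resource inside its first page. */
	if (offset >= page_size)
		return -EINVAL;

	/* The offset must leave at least one usable byte in the map. */
	if (offset >= map_size)
		return -EINVAL;

	/* Report overflow of the physical address computation. */
	if (addr > UINT64_MAX - offset)
		return -EOVERFLOW;

	m->raw = raw;
	m->raw_len = map_size;
	m->virt = (char *)raw + offset;
	m->phys = addr + offset;
	m->size = map_size - (size_t)offset;
	return 0;
}
```

On close, the unmap would use `raw` and `raw_len`, never `virt` and `size`, since the adjusted pointer is not what the mapping call returned.
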
A UIO-backed device registers its file descriptor with the Linux IRQ
controller so interrupt handling can find the owning metal device.
Closing the device must clear that association before closing the fd.

Add an internal unregister helper that detaches the device pointer
after the IRQ consumer has disabled the IRQ. Keep IRQ handler and
enable-state teardown owned by the standard IRQ disable and unregister
paths.

Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>

Keep metal_device_open() as the single public device-open API and add
a Linux "uio" bus that opens devices by matching dev_name against
/sys/class/uio/uioX/name.

The uio bus path treats the requested UIO name as the libmetal-visible
device identity, so device->bus->name remains "uio" and device->name
remains the requested name. Existing platform and pci bus opens
continue to use native bus device names and the existing UIO bind path.

Share the UIO populate flow so both native bus opens and class-name
opens use the same mmap, IRQ, DMA, and close handling.

Signed-off-by: Ben Levinsky <ben.levinsky@amd.com>
Contributor

@arnopo arnopo left a comment

Some of the commits are still not easy to review; reverse engineering is required to understand them.
Adding more details in the commit messages about the algorithm you are trying to apply would help.

Comment thread lib/system/linux/device.c
void *raw, *virt;
int irq_info;

i = 0;
Contributor

can be initialized when declared

Contributor Author

will fix

Comment thread lib/system/linux/device.c
phys = &ldev->region_phys[ldev->device.num_regions];
result = metal_uio_read_map_attr(ldev, i, "offset", &offset);
if (result)
break;
Contributor

Why do you break for this one?

Contributor Author

will add comment

Comment thread lib/system/linux/device.c
}
result = metal_open(ldev->dev_path, 0);
if (result < 0) {
metal_log(METAL_LOG_ERROR, "failed to open device %s\n",
Contributor

Suggested change
- metal_log(METAL_LOG_ERROR, "failed to open device %s\n",
+ metal_log(METAL_LOG_ERROR, "failed to open device %s: %s\n",

Contributor Author

will fix

Comment thread lib/system/linux/device.c
/*
* /sys/class/uio is a class, not a bus. Register the synthetic bus only
* when the UIO class exists and skip normal bus/driver probing.
*/
Contributor

The comment is not clear to me. The sentence "/sys/class/uio is a class, not a bus" is confusing.
Suggestion:

	/*
	 * Register the synthetic bus only when the /sys/class/uio
	 * class exists and skip normal bus/driver probing.
	 */

Contributor Author

will fix

Comment thread lib/system/linux/device.c
phys = &ldev->region_phys[ldev->device.num_regions];
result = metal_uio_read_map_attr(ldev, i, "offset", &offset);
if (result)
break;
Contributor

The error cases here and below need to be handled so that the function exits cleanly: closing, unregistering, freeing, and so on.

Contributor Author

will fix
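
One common way to satisfy this kind of request in C is a staged unwind, where each failure jumps to a label that releases only what was acquired so far. The sketch below models that pattern with a fake device; it is not the populate function from the patch. All `sketch_` names are hypothetical, the "open" and "register" steps are reduced to flags, and the failure point is injected through a parameter so each path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state acquired in three stages. */
struct sketch_dev {
	int fd_open;         /* stage 1: device node opened */
	int irq_registered;  /* stage 2: fd registered with the IRQ layer */
	void *regions;       /* stage 3: region bookkeeping allocated */
};

enum sketch_fail {
	SKETCH_FAIL_NONE,
	SKETCH_FAIL_OPEN,
	SKETCH_FAIL_IRQ,
	SKETCH_FAIL_MAP,
};

int sketch_dev_populate(struct sketch_dev *d, enum sketch_fail fail)
{
	int ret;

	if (fail == SKETCH_FAIL_OPEN)
		return -ENODEV; /* nothing acquired yet, nothing to undo */
	d->fd_open = 1;

	if (fail == SKETCH_FAIL_IRQ) {
		ret = -EIO;
		goto err_close;
	}
	d->irq_registered = 1;

	d->regions = (fail == SKETCH_FAIL_MAP) ? NULL : malloc(64);
	if (!d->regions) {
		ret = -ENOMEM;
		goto err_unregister;
	}

	return 0;

	/* Unwind in reverse order of acquisition. */
err_unregister:
	d->irq_registered = 0;
err_close:
	d->fd_open = 0;
	return ret;
}

/* Full teardown for a successfully populated device. */
void sketch_dev_release(struct sketch_dev *d)
{
	free(d->regions);
	d->regions = NULL;
	d->irq_registered = 0;
	d->fd_open = 0;
}
```

Whatever stage fails, the caller gets back a device with no stage left half-acquired, which is the property the review comment is asking for.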

Comment thread lib/system/linux/device.c
*newline = '\0';

return 0;
}
Contributor

Suggestion (can also be applied to metal_uio_read_str_attr()):

static int metal_linux_read_first_line(const char *path, char *output,
				       size_t output_len)
{
	FILE *fp;
	char *newline;
	int err = 0;

	if (!path || !output || output_len < 2)
		return -EINVAL;

	fp = fopen(path, "r");
	if (!fp)
		return -errno;

	if (!fgets(output, output_len, fp)) {
		/* Read error or empty file: report it, but still close fp. */
		err = ferror(fp) ? -errno : -ENODATA;
		goto close_file;
	}

	/* Strip the trailing newline, if any. */
	newline = strchr(output, '\n');
	if (newline)
		*newline = '\0';

close_file:
	fclose(fp);

	return err;
}

Contributor Author

ok will fix

Comment thread lib/system/linux/device.c
}
found = true;

result = snprintf(ldev->cls_path, sizeof(ldev->cls_path),
Contributor

It is not obvious what cls_path is. Please clarify with comments or by describing the ldev fields.

Contributor Author

will fix

Comment thread lib/system/linux/device.c

result = metal_uio_dev_bind(ldev, ldrv);
if (result)
return result;
Contributor

The error cases here and below need to be handled so that resources are freed.

Contributor Author

will fix
