2019-03-12 Tue
We generally find two kinds of implementations:
The kernel exposes PCI devices in sysfs:
file | function |
---|---|
config | PCI config space |
enable | Whether the device is enabled |
resource | PCI resource host addresses |
resource0..N | PCI resource N, if present |
resource0_wc..N_wc | PCI WC map resource N, if prefetchable |
The config space at /sys/bus/pci/devices/<device_addr>/config gives us vendor_id and device_id:
    int config = pci_open_resource(pci_addr, "config");
    uint16_t vendor_id = read_io16(config, 0);
    uint16_t device_id = read_io16(config, 2);
    uint32_t class_id = read_io32(config, 8) >> 24;
    close(config);
    if (class_id != 2) {
        error("Device %s is not a NIC", pci_addr);
    }
    if (vendor_id == 0x1af4 && device_id >= 0x1000) {
        return virtio_init(pci_addr, rx_queues, tx_queues);
    } else {
        // Our best guess is to try ixgbe
        return ixgbe_init(pci_addr, rx_queues, tx_queues);
    }
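The snippet above relies on read_io16 and read_io32, which are not shown here. A minimal sketch of how such helpers could look, assuming they are thin wrappers around pread(2) on the config file descriptor. The names cfg_read16, cfg_read32 and print_pci_ids are hypothetical, and a little-endian host is assumed (PCI config space is little-endian):

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read 16 bits at byte `offset` of an open config file.
// pread() does not move the file position, so reads can happen in any order.
// Returns 0 on success, -1 on a failed or short read.
static int cfg_read16(int fd, off_t offset, uint16_t* out) {
    assert(out != NULL);
    return pread(fd, out, sizeof(*out), offset) == (ssize_t) sizeof(*out) ? 0 : -1;
}

// Same idea for a 32-bit value.
static int cfg_read32(int fd, off_t offset, uint32_t* out) {
    assert(out != NULL);
    return pread(fd, out, sizeof(*out), offset) == (ssize_t) sizeof(*out) ? 0 : -1;
}

// Print vendor ID (offset 0), device ID (offset 2) and the class code,
// which is the top byte of the 32-bit word at offset 8.
static int print_pci_ids(int fd) {
    uint16_t vendor_id = 0, device_id = 0;
    uint32_t class_dword = 0;
    if (cfg_read16(fd, 0, &vendor_id) != 0 ||
        cfg_read16(fd, 2, &device_id) != 0 ||
        cfg_read32(fd, 8, &class_dword) != 0) {
        return -1;
    }
    printf("vendor 0x%04x device 0x%04x class 0x%02x\n",
           (unsigned) vendor_id, (unsigned) device_id,
           (unsigned) (class_dword >> 24));
    return 0;
}
```

Unlike the helpers used above, this sketch returns an error code instead of aborting, so the caller decides what to do on a short read.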
    uint8_t* pci_map_resource(const char* pci_addr) {
        char path[PATH_MAX];
        snprintf(path, PATH_MAX, "/sys/bus/pci/devices/%s/resource0", pci_addr);
        remove_driver(pci_addr);
        enable_dma(pci_addr);
        int fd = check_err(open(path, O_RDWR), "open pci resource");
        struct stat stat;
        check_err(fstat(fd, &stat), "stat pci resource");
        return (uint8_t*) check_err(mmap(NULL, stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0), "mmap pci resource");
    }
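Once resource0 is mapped, device registers are plain memory at fixed byte offsets from the returned pointer. A minimal sketch of register accessors, assuming 32-bit registers at 4-byte-aligned offsets. The helper names are illustrative, and real offsets come from the NIC's datasheet:

```c
#include <assert.h>
#include <stdint.h>

// Write a 32-bit register at byte offset `reg` of a mapped BAR.
// The volatile access keeps the compiler from caching or dropping it;
// the empty asm statement is a compiler-only barrier against reordering.
static inline void set_reg32(uint8_t* addr, uint32_t reg, uint32_t value) {
    assert(reg % 4 == 0);
    __asm__ volatile ("" : : : "memory");
    *((volatile uint32_t*) (addr + reg)) = value;
}

// Read a 32-bit register at byte offset `reg`.
static inline uint32_t get_reg32(const uint8_t* addr, uint32_t reg) {
    assert(reg % 4 == 0);
    __asm__ volatile ("" : : : "memory");
    return *((const volatile uint32_t*) (addr + reg));
}

// Read-modify-write helpers for setting or clearing individual bits.
static inline void set_flags32(uint8_t* addr, uint32_t reg, uint32_t flags) {
    set_reg32(addr, reg, get_reg32(addr, reg) | flags);
}

static inline void clear_flags32(uint8_t* addr, uint32_t reg, uint32_t flags) {
    set_reg32(addr, reg, get_reg32(addr, reg) & ~flags);
}
```

The accessors work on any memory, so they can be tried against an ordinary array before touching a real device.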
Actually, it's almost the same thing!
    int pci_open_resource(const char* pci_addr, const char* resource) {
        char path[PATH_MAX];
        snprintf(path, PATH_MAX, "/sys/bus/pci/devices/%s/%s", pci_addr, resource);
        debug("Opening PCI resource at %s", path);
        int fd = check_err(open(path, O_RDWR), "open pci resource");
        return fd;
    }
How can we get physical addresses?
/proc/self/pagemap
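A minimal sketch of a lookup through /proc/self/pagemap, assuming the documented layout: one 64-bit entry per virtual page, page frame number in bits 0-54, "present" flag in bit 63. The function names are hypothetical. On current kernels the frame number reads as zero without CAP_SYS_ADMIN, so the sketch reports failure in that case:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGEMAP_PFN_MASK ((UINT64_C(1) << 55) - 1) // bits 0-54: page frame number
#define PAGEMAP_PRESENT  (UINT64_C(1) << 63)       // bit 63: page is in RAM

// Combine a pagemap entry with the offset of `virt` inside its page.
static uint64_t pagemap_entry_to_phys(uint64_t entry, uintptr_t virt, uint64_t page_size) {
    assert(page_size > 0);
    return (entry & PAGEMAP_PFN_MASK) * page_size + virt % page_size;
}

// Translate a virtual address of the calling process.
// Returns 0 if the page is not present or the frame number is hidden.
static uint64_t virt_to_phys_sketch(const void* virt) {
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) return 0;
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) return 0;
    uint64_t entry = 0;
    // Entry index = virtual page number, each entry is 8 bytes.
    off_t offset = (off_t) ((uintptr_t) virt / (uintptr_t) page_size * sizeof(entry));
    ssize_t n = pread(fd, &entry, sizeof(entry), offset);
    close(fd);
    if (n != (ssize_t) sizeof(entry)) return 0;
    if (!(entry & PAGEMAP_PRESENT) || (entry & PAGEMAP_PFN_MASK) == 0) return 0;
    return pagemap_entry_to_phys(entry, (uintptr_t) virt, (uint64_t) page_size);
}
```

The result is only meaningful while the page stays where it is, which is why swapping and relocation matter here.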
Not that simple...
We can prevent swapping with mlock(2):
mlock(virt_addr, size);
How about page relocation?
Let's use huge pages!
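A minimal sketch of a huge-page allocation, assuming 2 MiB huge pages have been reserved on the system. It uses an anonymous mmap(2) with MAP_HUGETLB for brevity, which is a swapped-in technique: a driver may instead map a file on a hugetlbfs mount. The names alloc_huge_locked and round_up_huge are hypothetical:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

// Assumption: 2 MiB huge pages (the usual default on x86-64).
#define HUGE_PAGE_SIZE ((size_t) 2 << 20)

// Huge-page mappings must be a multiple of the huge page size.
static size_t round_up_huge(size_t size) {
    return (size + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE * HUGE_PAGE_SIZE;
}

// Allocate `size` bytes backed by huge pages and lock them in memory.
// Returns NULL if no huge pages are available or locking fails.
static void* alloc_huge_locked(size_t size) {
    assert(size > 0);
    size_t len = round_up_huge(size);
    void* mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (mem == MAP_FAILED) return NULL;
    if (mlock(mem, len) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}
```

Each huge page is also physically contiguous, so a buffer that fits inside one page needs only a single address translation.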
VirtIO is a bit special...
Let's take a look at ixy's packet buffers:
    struct pkt_buf {
        // physical address to pass a buffer to a nic
        uintptr_t buf_addr_phy;
        struct mempool* mempool;
        uint32_t mempool_idx;
        uint32_t size;
        uint8_t head_room[SIZE_PKT_BUF_HEADROOM];
        uint8_t data[] __attribute__((aligned(64)));
    };
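A short sketch of what this layout implies, assuming a 64-bit platform and SIZE_PKT_BUF_HEADROOM set to 40 (an assumption, chosen here so that the header fields fill exactly one 64-byte cache line). The helper pkt_data_phys is hypothetical. It shows how the address handed to the NIC could be derived from the physical address of the buffer itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Assumption: 40 bytes of headroom, so 8 + 8 + 4 + 4 + 40 = 64 bytes of header.
#define SIZE_PKT_BUF_HEADROOM 40

struct mempool; // opaque here, only used through a pointer

struct pkt_buf {
    // physical address to pass a buffer to a nic
    uintptr_t buf_addr_phy;
    struct mempool* mempool;
    uint32_t mempool_idx;
    uint32_t size;
    uint8_t head_room[SIZE_PKT_BUF_HEADROOM];
    uint8_t data[] __attribute__((aligned(64)));
};

// The aligned(64) attribute guarantees the payload starts on a cache line.
static_assert(offsetof(struct pkt_buf, data) % 64 == 0,
              "packet data must be cache-line aligned");

// Given the physical address of the start of a pkt_buf, return the
// physical address of its payload, which is what the NIC should DMA to.
static inline uintptr_t pkt_data_phys(uintptr_t buf_phys) {
    return buf_phys + offsetof(struct pkt_buf, data);
}
```

Because the metadata sits directly in front of the payload, one allocation and one address translation cover both.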