Prototyping with the Google Coral ML (TPU) Accelerator Module (PCIe)
I previously published a story about my experimentation with the Google Coral TPU Accelerator Module in USB 2.0 mode. This is a follow-up story in which I experimented with PCIe, using the Coral Module breakout board and ribbon wires for the PCIe differential pairs (TX, RX, CLK).
Before designing a PCIe X1 PCB, I wanted to find out whether a PCIe Gen2 system can be prototyped using protoboards and ribbon wires. I used a PCIe X1 breakout PCB I had designed and added a few more boards: a 3.3V to 1.8V step-down regulator, an SMD breakout with the TXB0108 logic level translator (an 8-channel part I had on hand; I used only one channel), and the Coral Module breakout.
I wired the following signals from the PCIe slot to the Coral Module:
- PCIe PERST# → TXB0108 B1 → TXB0108 A1 → Coral RST#
- PCIe 3.3V → Coral VIN and TXB0108 VCCB (High Voltage Side)
- PCIe 3.3V → 3.3V to 1.8V step down regulator → Coral AON (1.8V) and TXB0108 VCCA (Low Voltage Side)
- Coral PMIC_EN → Coral AON (1.8V)
- PCIe REFCLK+ → Coral REFCLK+
- PCIe REFCLK- → Coral REFCLK-
- PCIe PETp0 → 100nF AC coupling cap (0603 SMD used, 0402 recommended; not needed with most motherboards, which already have it installed) → Coral PCIE_RX_P
- PCIe PETn0 → 100nF AC coupling cap (0603 SMD used, 0402 recommended; not needed with most motherboards, which already have it installed) → Coral PCIE_RX_N
- PCIe PERp0 ← Coral PCIE_TX_P (AC coupling inside Coral Module TX line)
- PCIe PERn0 ← Coral PCIE_TX_N (AC coupling inside Coral Module TX line)
- PCIe GND — Coral GND
Follow the get started guide to install the PCIe driver and run the examples.
Once the driver is installed and the Coral Module is successfully enumerated, you should see the apex device. To verify that it is recognized, run the following Linux commands:
# ls /dev/apex*
/dev/apex_0
# lspci -v
...
02:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU (prog-if ff)
    Subsystem: Global Unichip Corp. Coral Edge TPU
    Flags: bus master, fast devsel, latency 0, IRQ 17
    Memory at f0100000 (64-bit, prefetchable) [size=16K]
    Memory at f0000000 (64-bit, prefetchable) [size=1M]
    Capabilities: [80] Express Endpoint, MSI 00
    Capabilities: [d0] MSI-X: Enable+ Count=128 Masked-
    Capabilities: [e0] MSI: Enable- Count=1/32 Maskable- 64bit+
    Capabilities: [f8] Power Management version 3
    Capabilities: [100] Vendor Specific Information: ID=1556 Rev=1 Len=008 <?>
    Capabilities: [108] Latency Tolerance Reporting
    Capabilities: [110] L1 PM Substates
    Capabilities: [200] Advanced Error Reporting
    Kernel driver in use: apex
    Kernel modules: apex
...
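The same check can be scripted, which is handy when the module is re-seated often during prototyping. This is a minimal sketch: the helper name `find_apex_devices` is hypothetical, and it only assumes that the driver exposes nodes named `apex_*` under `/dev`, as in the listing above.

```python
import glob
import os


def find_apex_devices(dev_dir="/dev"):
    """Return the sorted paths of Edge TPU device nodes (apex_0, apex_1, ...)."""
    return sorted(glob.glob(os.path.join(dev_dir, "apex_*")))


if __name__ == "__main__":
    devices = find_apex_devices()
    if devices:
        print("Edge TPU device nodes:", ", ".join(devices))
    else:
        # No node usually means the driver is not loaded or enumeration failed.
        print("No /dev/apex_* node found; check the driver and the wiring.")
```

An empty result does not say whether the driver or the wiring is at fault, so `lspci` is still the next thing to look at.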
Testing the module in PCIe Mode and comparing the result to USB 3.0 and USB 2.0 HS
- PCIe: 22.6ms first run (model load + inference), 3.2ms following tests (inference only)
- USB 3.0: 12ms first run (model load + inference), 2.3ms following tests (inference only)
- USB 2.0: 95ms first run (model load + inference), 8.5ms following tests (inference only)
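To make the three results easier to compare, the numbers can be split into a one-time cost and a steady-state rate. The sketch below uses only the measurements listed above; treating "first run minus following run" as the model load overhead is my simplifying assumption, and the `summarize` helper is hypothetical.

```python
# Measured times in milliseconds: (first run, following runs)
RESULTS_MS = {
    "PCIe": (22.6, 3.2),
    "USB 3.0": (12.0, 2.3),
    "USB 2.0": (95.0, 8.5),
}


def summarize(first_ms, steady_ms):
    """Split a measurement into approximate load overhead and steady-state rate."""
    return {
        # Assumption: the first run is model load plus one inference.
        "load_overhead_ms": first_ms - steady_ms,
        "inferences_per_s": 1000.0 / steady_ms,
    }


if __name__ == "__main__":
    for name, (first_ms, steady_ms) in RESULTS_MS.items():
        s = summarize(first_ms, steady_ms)
        print(f"{name}: ~{s['load_overhead_ms']:.1f} ms load overhead, "
              f"~{s['inferences_per_s']:.0f} inferences/s")
```

Seen this way, most of the gap between the PCIe and USB 3.0 numbers sits in the first run, where the model is transferred to the module.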
PCIe slower than USB 3.0? I expected results close to those published by Google: 11ms first run and 2.8ms for the following tests (inference only), similar to USB 3.0. What is wrong with my prototype? Maybe the wires, or the fact that I am using an older PC (Lenovo m93p)? I found the following entry in the system log of my Linux PC:
[ 0.134660] pci 0000:02:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x1 link at 0000:00:1c.1 (capable of 4.000 Gb/s with 5 GT/s x1 link)
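The two bandwidth figures in that log line follow from the line coding: PCIe Gen1 and Gen2 use 8b/10b encoding, so only 8 of every 10 transferred bits carry data. The sketch below reproduces the arithmetic and pulls the rates out of the log line; the function names and the regular expression are mine and were written against this one message only.

```python
import re

ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding (PCIe Gen1 and Gen2)


def usable_gbps(gt_per_s, lanes=1):
    """Usable bandwidth in Gb/s for a Gen1/Gen2 link."""
    return gt_per_s * lanes * ENCODING_EFFICIENCY


LOG_LINE = (
    "pci 0000:02:00.0: 2.000 Gb/s available PCIe bandwidth, limited by "
    "2.5 GT/s x1 link at 0000:00:1c.1 (capable of 4.000 Gb/s with 5 GT/s x1 link)"
)

# Captures the current rate/width and the rate/width the device is capable of.
PATTERN = re.compile(
    r"limited by ([\d.]+) GT/s x(\d+) link"
    r".*capable of [\d.]+ Gb/s with ([\d.]+) GT/s x(\d+) link"
)


def parse_link_report(line):
    """Return current and capable (GT/s, lanes) tuples, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    cur_rate, cur_lanes, max_rate, max_lanes = m.groups()
    return {
        "current": (float(cur_rate), int(cur_lanes)),
        "capable": (float(max_rate), int(max_lanes)),
    }


if __name__ == "__main__":
    report = parse_link_report(LOG_LINE)
    for label, (rate, lanes) in report.items():
        print(f"{label}: {rate} GT/s x{lanes} -> {usable_gbps(rate, lanes):.3f} Gb/s")
```

Both computed values match the 2.000 Gb/s and 4.000 Gb/s printed by the kernel, which confirms the link is running at the Gen1 rate on a single lane.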
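The link was running at 2.5 GT/s (the Gen1 rate) rather than the 5 GT/s the module is capable of, so the available bandwidth was halved. The log does not say why. I will test with a custom-made PCIe board with the Coral Module soldered directly, without protoboards connected by wires, and see if I can improve the response time.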
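When comparing board revisions, the negotiated link speed can also be read from sysfs instead of searching the boot log. This is a sketch under assumptions: it relies on the `current_link_speed`, `max_link_speed`, `current_link_width` and `max_link_width` attributes that reasonably recent Linux kernels expose for PCIe devices, it uses the address `0000:02:00.0` from my log (yours will likely differ), and the helper names are hypothetical.

```python
from pathlib import Path

LINK_ATTRS = (
    "current_link_speed",
    "max_link_speed",
    "current_link_width",
    "max_link_width",
)


def read_link_status(bdf="0000:02:00.0", sysfs_root="/sys/bus/pci/devices"):
    """Read the PCIe link attributes of one device; missing ones map to None."""
    dev = Path(sysfs_root) / bdf
    status = {}
    for attr in LINK_ATTRS:
        try:
            status[attr] = (dev / attr).read_text().strip()
        except OSError:
            # Attribute or device not present (older kernel, wrong address).
            status[attr] = None
    return status


def runs_below_max_speed(status):
    """True when both speeds are known and the current one differs from the max."""
    cur, top = status["current_link_speed"], status["max_link_speed"]
    return cur is not None and top is not None and cur != top


if __name__ == "__main__":
    status = read_link_status()
    for attr, value in status.items():
        print(f"{attr}: {value}")
    if runs_below_max_speed(status):
        print("Link is running below the maximum speed reported for the device.")
```

The comparison is a plain string check because the exact text of these attributes varies between kernel versions.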
Lessons Learned
- Do not forget the AC coupling capacitors on the Coral PCIE_RX lines. 100nF 0603 SMD works (0402 recommended). Most motherboards already have them installed, so they are only needed if you are designing your own custom host.
- MSI-X is required; it is available in the PCIe 2.0 specification and later
- Keep wires as short as possible