[SOLVED] DPDK 20.11 – IPv4 Fragmentation – indirect pool gets exhausted

Issue

I'm trying to fragment an IPv4 packet using the logic below:

~after pkts ingress~

struct rte_port_ring_writer *p = port_out->h_port;

pool_direct = rte_mempool_lookup("MEMPOOL0");  
pool_indirect = rte_mempool_lookup("MEMPOOL1");

printf("before frag mempool size d %d in %d\n",rte_mempool_avail_count(pool_direct),rte_mempool_avail_count(pool_indirect));

struct rte_mbuf *frag_pkts[MAX_FRAG_SIZE];  
int out_pkts = rte_ipv4_fragment_packet(m, frag_pkts, n_frags, ip_mtu, pool_direct, pool_indirect);

printf("after frag mempool size d %d in %d\n",rte_mempool_avail_count(pool_direct),rte_mempool_avail_count(pool_indirect));

if (out_pkts > 0)
    port_out->ops.f_tx_bulk(port_out->h_port, frag_pkts, RTE_LEN2MASK(out_pkts, uint64_t));
else
    printf("frag failed\n");

rte_pktmbuf_free(m);    /* free parent pkt */

Now the problem here is that the indirect mempool gets exhausted. As a result, after a few bursts of packets, fragmentation fails with -ENOMEM. I cannot quite understand why the PMD doesn't free the buffers and put the mempool objects back into MEMPOOL1. Could it be because the NIC ports are bound to MEMPOOL0 while the fragment packets being egressed come from MEMPOOL1?

Please find below the log for the above snippet, which prints the available slots in the direct (d) and indirect (in) mempools:

before frag mempool size d 2060457 in 2095988
after frag mempool size d 2060344 in 2095952
before frag mempool size d 2060361 in 2095945
after frag mempool size d 2060215 in 2095913
.
.
.
before frag mempool size d 2045013 in 0
after frag mempool size d 2045013 in 0
before frag mempool size d 2045013 in 0
after frag mempool size d 2045013 in 0
before frag mempool size d 2045013 in 0

I can see the direct mempool count decrease and increase as packets ingress and are dropped/egressed, as expected. I can also confirm that I receive an initial burst of fragmented packets roughly equal to the MEMPOOL1 size. Any input towards understanding the cause of the problem is much appreciated.

P.S.: We had the same problem in DPDK 17.11. We had to refactor rte_ipv4_fragment_packet() to not use indirect chaining of fragments and instead generate them as standalone copies.

Edit:
DPDK version – 20.11
Env – linux – centos 7
PMD – i40e – using bond in mode4
Pkt size – 128
MTU – 70

Mempools are created with rte_pktmbuf_pool_create(),
thus with no SC/SP flags (defaulting to MC/MP). Also, n_frags < MAX_FRAG_SIZE always holds.

Thanks & Regards,
Vishal Mohan

Solution

The DPDK API rte_ipv4_fragment_packet is exercised by both testpmd and the DPDK example ip_fragmentation. It is also covered by the DPDK test suite, which is run for each release.

Based on internal tests and proper use of the API (for example, in ip_fragmentation), the issue cannot be reproduced. Hence it is highly unlikely that the API is leaking mempool buffers, other than in some special corner case (which is yet to be found).

Based on analysis of the code snippet, the following could cause the mempool exhaustion:

  1. Failing to free the direct buffer (the original packet) after fragmentation.
  2. Failing to free one or more fragments (indirect buffers) when tx_burst fails or sends only part of the burst.

[EDIT-1] Based on the email update and comments, there is indeed no problem with the DPDK API rte_ipv4_fragment_packet. The issue is in the application logic; the observed behaviour is:

  1. The DPDK bond PMD leads to mempool exhaustion with the current code snippet.
  2. The DPDK bond PMD has no issues with the DPDK example using rte_ipv4_fragment_packet.
  3. The DPDK i40e PMD has an issue with the current code snippet.
  4. The DPDK i40e PMD has no issue with the DPDK example using rte_ipv4_fragment_packet.

Hence the issue lies in the sample code snippet and its usage, not in the DPDK API.

Answered By – Vipin Varghese

Answer Checked By – Candace Johnson (BugsFixing Volunteer)
