Overview
FortiGate firewalls rely on a multi-processor architecture that separates traffic processing across specialized hardware. When properly tuned, a FortiGate can deliver wire-speed throughput with full security inspection. When misconfigured, the same device can bottleneck at a fraction of its rated capacity -- dropping packets, triggering conserve mode, and degrading the user experience.
The performance gap is real. A FortiGate 600F is rated at 78 Gbps firewall throughput and 9.5 Gbps with full NGFW (IPS + application control + malware protection). That is an 8x difference between hardware-offloaded and CPU-processed traffic. Every session that cannot be offloaded to the NP (Network Processor) falls back to the CPU, consuming resources that compound under load. In high-throughput environments running 10+ Gbps with UTM enabled, the difference between an optimized and unoptimized FortiGate is the difference between stable operation and conserve mode at 2 AM.
FortiGate ASIC Architecture
FortiGate devices use a purpose-built Security Processing Unit (SPU) architecture with three key components:
| Processor | Full Name | Function | Examples |
|---|---|---|---|
| NP | Network Processor | Hardware-accelerated packet forwarding, NAT, IPsec, VXLAN encap/decap | NP6, NP6XLite, NP7 |
| CP | Content Processor | SSL/TLS offload, IPS pattern matching, encryption/decryption | CP9 |
| SP/CPU | Security Processor / Main CPU | Flow-based and proxy-based UTM inspection, management plane, logging | ARM/x86 cores |
The optimization goal is straightforward: maximize the percentage of traffic processed by NP and CP hardware, and minimize what falls to the CPU.
Traffic Flow Through FortiGate
Traffic follows one of three paths:
- Fast Path (NP Offload) -- After the first few packets of a session are inspected by the CPU and a policy match is found, the session is offloaded to the NP for hardware-accelerated forwarding. This is the highest-performance path. No CPU involvement after offload.
- Kernel Path (Software Fast Path) -- Sessions that cannot be NP-offloaded but do not require proxy-mode inspection. The kernel handles forwarding in software. Performance is good but not wire-speed.
- Proxy Path (CPU) -- Sessions requiring deep content inspection (proxy-mode antivirus, explicit web proxy, SSL deep inspection without CP offload). Every packet passes through the CPU. This is the slowest path and the primary bottleneck under load.
What This Guide Covers:
| Area | Optimization Target |
|---|---|
| NP Offload | Maximize hardware-accelerated session count |
| CP Offload | Offload SSL and IPS to content processors |
| Session Table | Right-size table, tune TTLs, reduce waste |
| UTM Profiles | Balance security depth with throughput |
| Firewall Policies | Reduce lookup time, optimize ordering |
| SD-WAN | Tune probes, thresholds, load balancing |
| Memory | Prevent conserve mode, tune thresholds |
| Interfaces | MTU, LACP, ECMP, flow control |
| Logging | Reduce logging overhead without losing visibility |
Warning: Performance tuning changes can affect security inspection depth and traffic flow behavior. Always benchmark before and after each change. Test in a lab or maintenance window first. Document every change for audit and rollback purposes.
Scenario
This guide addresses four common performance challenges:
1. High-Throughput Data Center / Campus Edge -- You are running a FortiGate at a data center edge or campus aggregation point handling 10+ Gbps sustained throughput. You need every session offloaded to hardware where possible, and UTM profiles tuned for flow-mode performance without sacrificing meaningful security coverage.
2. UTM Performance Degradation -- You have enabled IPS, antivirus, web filtering, application control, and SSL deep inspection on the same policy. CPU utilization climbs above 80% during business hours. Throughput degrades. Users report slow page loads and timeouts. You need to identify which UTM features are consuming the most CPU and optimize or restructure them.
3. Conserve Mode Incidents -- Your FortiGate has entered conserve mode one or more times, dropping new sessions and degrading existing ones. You need to understand why memory was exhausted, tune thresholds, and prevent recurrence.
4. SD-WAN Performance Optimization -- You are running SD-WAN with multiple WAN links and need to optimize path selection, minimize probe overhead, tune load balancing algorithms, and ensure SLA-based failover happens quickly without causing flaps.
Regardless of scenario, the methodology is the same: baseline current performance, identify bottlenecks, apply targeted optimizations, verify improvements, and document changes.
Pre-Optimization Assessment
Before making any changes, capture comprehensive baseline metrics. You need to know where you are to measure improvement.
System-Level Metrics
# Overall system status -- CPU, memory, firmware, uptime
get system status
# Real-time CPU usage per core
get system performance status
# Detailed CPU usage breakdown (user, system, idle, interrupt)
diagnose sys top 1 20
# Memory usage summary
diagnose hardware sysinfo memory
# Current session count vs maximum
diagnose sys session stat
# Full session table statistics
diagnose sys session full-stat
NP (Network Processor) Counters
# NP6/NP7 session offload statistics
diagnose npu np6 session-stats
# For NP7-based platforms:
diagnose npu np7 session-stats
# NP hardware session count
diagnose npu np6 sse-stats
# NP packet counters (drops, errors, throughput)
diagnose npu np6 port-list
# Check NP fast-path utilization
diagnose npu np6 dce
# Identify sessions NOT offloaded (and why)
diagnose npu np6 sse-stats | grep offload
Interface Throughput
# Real-time interface throughput (run for 10+ seconds to get stable readings)
diagnose netlink interface list
# Detailed interface counters -- packets, bytes, errors, drops
fnsysctl ifconfig <interface-name>
# Per-VDOM interface stats (if VDOMs enabled)
diagnose netlink interface list | grep -A5 <interface>
Content Processor Metrics
# CP utilization percentage
diagnose sys cps stats
# SSL offload statistics
diagnose vpn ssl stats
# IPS engine status and packet counts
diagnose ips session status
Record Your Baseline
Create a baseline document with these values:
| Metric | Command | Your Baseline Value |
|---|---|---|
| CPU Usage (average) | get system performance status | ___% |
| Memory Usage | get system performance status | ___% |
| Active Sessions | diagnose sys session stat | ___ |
| Max Sessions | diagnose sys session stat | ___ |
| NP Offloaded Sessions | diagnose npu np6 session-stats | ___% |
| NP Drops | diagnose npu np6 sse-stats | ___ |
| Interface Throughput (peak) | diagnose netlink interface list | ___ Gbps |
| CP Utilization | diagnose sys cps stats | ___% |
Tip: Run baseline collection during peak traffic hours, not at 3 AM. Performance tuning is meaningless if you baseline during idle periods. Collect data for at least 30 minutes to capture representative load patterns.
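The baseline table above lends itself to scripting. Below is a minimal before/after comparator sketch -- the metric names and values are illustrative placeholders you would fill in by hand from the CLI commands listed above, not anything parsed from a live device:

```python
# Compare a saved baseline against current readings and report the change.
# Values are entered manually from the baseline table's CLI commands;
# nothing here talks to a FortiGate.

def compare_to_baseline(baseline: dict, current: dict) -> dict:
    """Return per-metric absolute and percent change."""
    report = {}
    for metric, base in baseline.items():
        now = current.get(metric)
        if now is None or base == 0:
            continue
        report[metric] = {
            "baseline": base,
            "current": now,
            "delta": now - base,
            "pct_change": round(100.0 * (now - base) / base, 1),
        }
    return report

# Illustrative numbers only
baseline = {"cpu_pct": 78, "mem_pct": 71, "sessions": 145000, "np_offload_pct": 46}
current  = {"cpu_pct": 52, "mem_pct": 66, "sessions": 118000, "np_offload_pct": 81}

for metric, row in compare_to_baseline(baseline, current).items():
    print(f"{metric}: {row['baseline']} -> {row['current']} ({row['pct_change']:+.1f}%)")
```

Re-run the same script after each optimization step so every change is measured against the same reference point.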
Step 1: Understanding the Traffic Flow
Before optimizing, you must understand which path your traffic is actually taking. Every session on a FortiGate follows a decision tree that determines whether it gets hardware-offloaded or falls to the CPU.
The Session Lifecycle
When a new session arrives:
- First packet hits the ingress interface and is processed by the CPU (always -- even NP-capable sessions start here).
- Policy lookup occurs -- the CPU matches the packet against firewall policies in order.
- Security profile evaluation -- if the matched policy has UTM profiles, the CPU determines what inspection is needed.
- Offload decision -- if the session is eligible for NP offload, the CPU programs the NP with the session entry and all subsequent packets are forwarded in hardware.
- Ongoing inspection -- if the session cannot be offloaded, it remains on the kernel path or proxy path for the session lifetime.
What Qualifies for NP Offload
Sessions are eligible for NP hardware offload when:
- The policy uses flow-based inspection (not proxy)
- The ingress and egress interfaces are connected to the same NP chip
- The session does not require content inspection that forces CPU processing
- NAT is simple (source NAT, destination NAT -- not complex NAT64 with ALG)
- The protocol is TCP, UDP, ICMP, or IPsec ESP/AH
What Disqualifies NP Offload
Sessions stay on the CPU when any of these apply:
| Disqualifier | Reason |
|---|---|
| Proxy-mode inspection | Full content buffering requires CPU |
| Traffic shaping (per-IP bandwidth) | Per-session policing requires CPU tracking |
| Policy with captive portal | Redirect requires CPU |
| Ingress/egress on different NP chips | NP cannot forward between chips |
| Session with ALG (SIP, FTP active) | Application layer gateway requires CPU parsing |
| Multicast traffic (some platforms) | Not all NP versions support multicast offload |
| Traffic logging with per-packet detail | Packet-level logging prevents offload |
| Explicit web proxy sessions | Proxy daemon processes all packets |
| ZTNA proxy mode | ZTNA proxy terminates sessions on CPU |
| VoIP/SCCP with ALG enabled | ALG modifies payload, preventing offload |
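The eligibility rules and disqualifiers above amount to a simple decision function. A conceptual sketch -- the flag names are invented for illustration; the real decision happens inside FortiOS and is only observable via `npu_flag` in the session list:

```python
# Simplified NP-offload eligibility check mirroring the disqualifier table.
# Session attribute names are illustrative, not FortiOS fields.

DISQUALIFIERS = {
    "proxy_inspection":    "proxy-mode inspection requires CPU buffering",
    "per_ip_shaping":      "per-session policing requires CPU tracking",
    "captive_portal":      "redirect requires CPU",
    "cross_np_interfaces": "NP cannot forward between chips",
    "alg_active":          "ALG payload parsing requires CPU",
    "per_packet_logging":  "packet-level logging prevents offload",
}

def np_offload_eligible(session: dict) -> tuple:
    """Return (eligible, reasons) for a session described by boolean flags."""
    reasons = [msg for flag, msg in DISQUALIFIERS.items() if session.get(flag)]
    return (not reasons, reasons)

ok, why = np_offload_eligible({"proxy_inspection": False, "alg_active": True})
print(ok, why)  # ALG disqualifies the session from hardware offload
```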
Verify Current Offload Status
# See how many sessions are currently offloaded vs not
diagnose sys session stat
# Look for the "npu offloaded" line in output
# Example output:
# misc info: session_count=125000 setup_rate=850
# npu offloaded: 98000
# non-offloaded: 27000
# Check a specific session's offload status
diagnose sys session list | grep "npu_flag"
# npu_flag=set means offloaded, npu_flag=clear means CPU-processed
# Get the offload ratio
diagnose npu np6 session-stats
Target: Aim for 70-90% or more of sessions being NP-offloaded in a well-optimized environment. If you are below 50%, significant gains are available.
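Given the sample `diagnose sys session stat` output shown above, the offload ratio takes a few lines of parsing. The regexes match only this guide's sample format; real output varies by FortiOS version:

```python
import re

# Compute the NP offload ratio from the counters in the sample output
# above. Format-specific: adjust the regexes for your firmware's output.

sample = """\
misc info: session_count=125000 setup_rate=850
npu offloaded: 98000
non-offloaded: 27000
"""

def offload_ratio(text: str) -> float:
    offloaded = int(re.search(r"npu offloaded:\s*(\d+)", text).group(1))
    total = int(re.search(r"session_count=(\d+)", text).group(1))
    return 100.0 * offloaded / total

print(f"offloaded: {offload_ratio(sample):.1f}%")  # 78.4% -- above the 70% floor
```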
Step 2: NP (Network Processor) Optimization
The Network Processor is your primary performance lever. Every session offloaded to the NP runs at hardware speed with zero CPU cost.
Verify NP Offloading is Enabled Globally
# Check global NP acceleration status
config system np6
show
end
# Ensure NP offload is not disabled at the system level
config system settings
show full | grep "np-offload"
end
If NP offload has been disabled (sometimes done during troubleshooting and never re-enabled), turn it back on:
config system np6
set fastpath enable
end
Enable NP Offload on Specific Policies
By default, policies allow NP offload. But it can be explicitly disabled per policy. Verify and enable:
# Check a specific policy for auto-asic-offload
config firewall policy
edit <policy-id>
show full | grep auto-asic-offload
next
end
# Ensure auto-asic-offload is enabled (default is enable)
config firewall policy
edit <policy-id>
set auto-asic-offload enable
next
end
Warning: Do not blindly enable auto-asic-offload on policies where you need per-packet logging or traffic shaping with per-IP granularity. Those features require CPU processing and will silently lose functionality if the session is forced onto the NP.
Identify Non-Offloaded Sessions and Root Cause
# List sessions that are NOT offloaded
diagnose sys session list | grep "npu_flag=clear"
# For each non-offloaded session, check why
# Look for: proxy mode, traffic shaping, ALG, interface mismatch
diagnose sys session list | grep -B5 -A5 "npu_flag=clear" | head -100
# Check which interfaces map to which NP chip
diagnose npu np6 port-list
# Output shows each interface's NP assignment
# Interfaces on different NPs cannot offload sessions between them
NP6 vs NP7 Specific Tuning
NP6 Platforms (FortiGate 200F, 400F, etc.):
# NP6-specific session statistics
diagnose npu np6 session-stats
# NP6 VXLAN offload (if using VXLAN)
config system np6
set vxlan-offload enable
end
# NP6 IPsec offload verification
diagnose vpn ike gateway list | grep npu
NP7 Platforms (FortiGate 600F, 1000F, 2600F, 3000F, etc.):
# NP7 enhanced session statistics
diagnose npu np7 session-stats
# NP7 supports higher session counts and more offload features
# Verify NP7 capabilities
diagnose npu np7 dce
# NP7 hyperscale mode (for platforms that support it)
# Check if hyperscale is available and enabled
config system np7
show
end
NP Interface Mapping Optimization
For maximum offload, ensure high-traffic flows traverse interfaces on the same NP chip:
# View NP-to-interface mapping
diagnose npu np6 port-list
# Example output:
# np6_0: port1 port2 port3 port4
# np6_1: port5 port6 port7 port8
# np6_2: port9 port10 npu0_vlink0 npu0_vlink1
# If your primary WAN is on np6_0 and primary LAN is on np6_1,
# traffic between them CANNOT be NP-offloaded between chips
# Solution: Re-cable so WAN and LAN are on the same NP chip
Production Impact: Re-cabling interfaces requires a maintenance window and firewall policy updates. Plan carefully. Verify interface-to-NP mapping BEFORE purchasing and deploying -- it determines your offload ceiling.
Verify NP Optimization Improvements
# After changes, re-check offload ratio
diagnose npu np6 session-stats
# Compare "offloaded" count to your baseline
# Monitor in real-time (run for 30+ seconds during peak)
diagnose sys session stat
# Look for npu offloaded count increasing relative to total sessions
Expected Impact: Properly enabling NP offload on eligible sessions can improve throughput by 50-300% for affected traffic flows, while simultaneously reducing CPU utilization by 20-60%.
Step 3: CP (Content Processor) Optimization
The Content Processor handles SSL/TLS termination, IPS signature pattern matching, and encryption/decryption. Offloading these operations to the CP frees the CPU for tasks only the CPU can handle.
Verify CP Status and Utilization
# Check CP utilization
diagnose sys cps stats
# CP hardware status
diagnose hardware deviceinfo nic
# SSL offload statistics
diagnose vpn ssl stats
CP Offload for SSL Deep Inspection
SSL deep inspection is one of the most CPU-intensive operations. The CP can handle the bulk of the cryptographic work:
# Verify SSL inspection is using CP offload
config firewall ssl-ssh-profile
edit "deep-inspection"
show full | grep "ssl-offload"
next
end
# Enable CP offload for SSL inspection (if not already set)
config system global
set ssl-hw-acceleration enable
end
IPS CP Offload
IPS signature matching can be partially offloaded to the CP for pattern matching acceleration:
# Check IPS engine configuration
config ips global
show
end
# Enable CP-accelerated IPS (default on supported platforms)
config ips global
set cp-accel-mode advanced
endNote: Not all IPS signatures can be CP-offloaded. Complex regex patterns and protocol decoder-dependent signatures still require CPU. The CP handles the bulk pattern pre-filtering, reducing CPU load by filtering out non-matching packets before they reach the IPS engine.
CP Load Balancing
On platforms with multiple CP chips, ensure load is distributed:
# Check CP load distribution
diagnose sys cps stats
# If one CP is overloaded while others are idle, check SSL profile
# distribution and ensure traffic is not pinned to a single CP
Monitor CP After Changes
# Re-check CP utilization
diagnose sys cps stats
# Compare to baseline -- CP utilization should increase
# (more work moved to CP = higher CP use but lower CPU use)
# CPU utilization should decrease correspondingly
get system performance status
Expected Impact: Enabling CP offload for SSL inspection can reduce CPU utilization by 15-40% in environments with heavy HTTPS traffic, while maintaining the same inspection depth.
Step 4: Session Table Optimization
The session table is the core state-tracking structure. Every active connection consumes a session table entry. When the table fills up, new connections are dropped -- even if CPU and memory are available.
Check Current Session Table Configuration
# Current session count and maximum
diagnose sys session stat
# Example output:
# misc info: session_count=145000 setup_rate=1200
# ...
# session table size: 500000
# Check configured session TTL defaults (table size itself is platform-fixed)
config system session-ttl
show
end
Tune Session Table Size
# Session table capacity scales with your platform's memory
# FortiGate 200F: ~2M sessions; 600F: ~8M; 3000F: 24M+
# The maximum is fixed by platform and RAM -- you cannot raise it directly
# Check hardware capacity for your platform
get hardware status
# What you can tune is how long idle sessions occupy table entries,
# via the default session TTL:
config system session-ttl
set default 3600
end
Session TTL Tuning Per Protocol
Default session TTLs are conservative. Reducing them frees table entries faster:
# View current TTL settings
config system session-ttl
show
end
# Optimize TTLs for common protocols
config system session-ttl
set default 3600
config port
# DNS -- queries are fast, reduce from default
edit 1
set protocol 17
set start-port 53
set end-port 53
set timeout 30
next
# HTTP -- most pages load within seconds
edit 2
set protocol 6
set start-port 80
set end-port 80
set timeout 600
next
# HTTPS -- slightly longer for complex web apps
edit 3
set protocol 6
set start-port 443
set end-port 443
set timeout 1800
next
# ICMP -- ping should not hold sessions for minutes
edit 4
set protocol 1
set start-port 0
set end-port 65535
set timeout 15
next
# NTP -- one-shot, reduce aggressively
edit 5
set protocol 17
set start-port 123
set end-port 123
set timeout 15
next
end
end
Warning: Reducing TTLs too aggressively can cause issues with long-lived connections (database connections, SSH tunnels, video conferencing). Test with your specific application mix. Monitor for unexpected session teardowns after changing TTLs.
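To reason about how much a TTL change helps before committing it, Little's law gives a useful estimate: steady-state table occupancy is roughly the new-session rate times the average session lifetime, and for short-lived protocols like DNS the idle TTL dominates that lifetime. A sketch with illustrative numbers:

```python
# Estimate steady-state session table occupancy for one traffic class:
# occupancy ~= setup rate x average session lifetime (Little's law).
# The rates and TTLs below are illustrative, not measured values.

def steady_state_sessions(setup_rate_per_sec: float, avg_lifetime_sec: float) -> int:
    return round(setup_rate_per_sec * avg_lifetime_sec)

# 800 DNS sessions/sec at a long idle TTL vs the 30s TTL configured above
print(steady_state_sessions(800, 180))  # 144000 entries held
print(steady_state_sessions(800, 30))   # 24000 entries held
```

The same arithmetic, run per protocol against your observed setup rates, predicts how much table headroom a TTL change will actually buy.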
Session Helper Management
Session helpers (ALGs) create additional sessions for protocols like SIP, FTP, and TFTP. These consume session table entries and prevent NP offload:
# List active session helpers
config system session-helper
show
end
# Disable session helpers for protocols you don't use
# Example: Disable SIP ALG if not using SIP through the FortiGate
config system session-helper
delete <helper-id-for-sip>
end
# Or disable specific helpers
config system settings
set sip-helper disable
set sip-nat-trace disable
end
Session Pickup for HA Environments
In High Availability (HA) configurations, session pickup keeps sessions alive during failover but consumes memory:
# Check session pickup status
config system ha
show full | grep "session-pickup"
end
# Enable session pickup for critical protocols only
config system ha
set session-pickup enable
set session-pickup-connectionless enable
set session-pickup-delay enable
end
Asymmetric Routing Handling
Asymmetric routing (traffic entering on one interface and returning on another) can cause session mismatches and performance issues:
# Enable asymmetric routing tolerance if needed
config system settings
set asymroute enable
end
# For specific interfaces with known asymmetric paths
config system interface
edit "wan1"
set fail-detect enable
next
endWarning: Enabling
asymrouteglobally reduces security because the FortiGate cannot properly track bidirectional session state. Only enable it when asymmetric routing is unavoidable and cannot be fixed at the routing layer.
Verify Session Table Improvements
# After TTL changes, monitor session count over time
diagnose sys session stat
# Session count should be lower for the same traffic load
# (faster TTL expiry = fewer stale sessions consuming entries)
# Check session setup rate (should remain stable)
# If setup rate drops, TTLs may be too aggressive
diagnose sys session stat | grep setup_rate
Expected Impact: Optimized session TTLs can reduce active session count by 20-40%, freeing table capacity and reducing memory usage proportionally.
Step 5: UTM Profile Optimization
UTM (Unified Threat Management) profiles are the largest source of CPU consumption. Each inspection feature adds processing overhead. The goal is not to disable security -- it is to right-size inspection depth for your threat model.
IPS Sensor Optimization
IPS is typically the second-most CPU-intensive feature (after SSL inspection). Optimize by reducing the signature set to what is relevant:
# Check current IPS sensor configuration
config ips sensor
edit "high-performance"
show
next
end
# Create an optimized IPS sensor that targets your actual environment
config ips sensor
edit "optimized-sensor"
# Start with severity-based filtering
config entries
# Block critical and high severity -- these are the real threats
edit 1
set severity critical high
set status enable
set action block
set log enable
next
# Log medium severity for visibility, but pass traffic
edit 2
set severity medium
set status enable
set action pass
set log enable
next
# Disable low/info signatures -- they generate noise and CPU load
edit 3
set severity low info
set status disable
next
end
next
end
Targeted signature filtering: Disable signatures for software you do not run:
config ips sensor
edit "optimized-sensor"
config entries
# Disable signatures for platforms not in your environment
edit 4
set os "Linux"
set status disable
# Only if you have NO Linux servers -- be careful here
next
# Disable client-side signatures if this is a server farm
edit 5
set application "Web_Client"
set status disable
next
end
next
end
Warning: Disabling IPS signatures creates blind spots. Only disable signatures for platforms and applications that genuinely do not exist in your environment. Maintain a quarterly review of disabled signatures against your actual asset inventory.
Antivirus Scanning Mode
The antivirus scanning mode has a massive impact on performance:
# Check current AV profile
config antivirus profile
edit "optimized-av"
show
next
end
# Flow-based scanning is 5-10x faster than proxy-based
config antivirus profile
edit "optimized-av"
set scan-mode quick
set inspection-mode flow-based
config http
set av-scan enable
set outbreak-prevention enable
end
config ftp
set av-scan enable
end
config smtp
set av-scan enable
end
# Disable scanning for protocols not in use
config imap
set av-scan disable
end
config pop3
set av-scan disable
end
next
end
| AV Mode | Throughput Impact | Detection Capability |
|---|---|---|
| Flow-based (quick scan) | Lowest impact, ~5% throughput reduction | Catches known malware, fast pattern matching |
| Flow-based (full scan) | Moderate impact, ~15% throughput reduction | Better detection, reassembles packets |
| Proxy-based | Highest impact, ~40-60% throughput reduction | Full file reconstruction, best detection |
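To translate the table's reduction figures into expected numbers, a quick calculation -- the percentages are this guide's rough estimates, not datasheet values, and the 9.5 Gbps base is the NGFW figure cited in the overview:

```python
# Effective throughput implied by the AV-mode table above.
# Reduction fractions are the guide's rough estimates (proxy uses the
# midpoint of the 40-60% range), not vendor datasheet numbers.

AV_MODE_REDUCTION = {
    "flow-quick": 0.05,
    "flow-full":  0.15,
    "proxy":      0.50,
}

def effective_gbps(base_gbps: float, mode: str) -> float:
    return base_gbps * (1.0 - AV_MODE_REDUCTION[mode])

for mode in AV_MODE_REDUCTION:
    print(f"{mode}: {effective_gbps(9.5, mode):.2f} Gbps")
```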
Web Filter Performance Tuning
# Use FortiGuard category-based filtering (faster than URL-by-URL)
config webfilter profile
edit "optimized-webfilter"
set inspection-mode flow-based
config ftgd-wf
# Block categories by group rather than individual URLs
config filters
edit 1
set category 1
set action block
next
# ... configure category blocks ...
end
end
# Disable URL rating for known-safe traffic (internal sites)
set url-extraction-redirects disable
next
end
Application Control Optimization
# Application control -- use categories, not individual signatures
config application list
edit "optimized-appctrl"
config entries
# Block high-risk categories
edit 1
set category 22 # P2P
set action block
next
edit 2
set category 19 # Proxy
set action block
next
# Monitor (not block) general categories to reduce inspection depth
edit 3
set category 5 # General.Interest
set action pass
set log enable
next
end
next
end
SSL Deep Inspection Performance Tuning
SSL deep inspection is the single most CPU-intensive feature. Optimize aggressively:
# Check current SSL inspection profile
config firewall ssl-ssh-profile
edit "deep-inspection"
show full
next
end
# Optimize SSL inspection
config firewall ssl-ssh-profile
edit "optimized-ssl"
# Exempt trusted and high-bandwidth categories from inspection
config exempt
edit 1
set fortiguard-category 31 # Finance/Banking
next
edit 2
set fortiguard-category 33 # Health/Medicine
next
# Exempt CDN and update traffic (high volume, low risk)
edit 3
set address "Microsoft-365" "Google-Cloud" "AWS" "Akamai"
next
end
# Use certificate inspection for exempted categories
# (still validates certs without decrypting content)
set ssl-exemptions-log enable
# Tune TLS version support
config https
set ports 443
set status deep-inspection
set client-certificate bypass
end
next
end
Exemption strategy for SSL inspection:
| Category | Recommendation | Reason |
|---|---|---|
| Financial sites (banks) | Certificate inspection only | Regulatory risk, these sites have their own security |
| Health/medical | Certificate inspection only | HIPAA concerns with intercepting PHI |
| CDN/cloud updates | Exempt | High bandwidth, low threat, verified publishers |
| SaaS (M365, Google) | Exempt or certificate-only | Already inspected by cloud security stack |
| Everything else | Deep inspection | This is where threats hide |
Warning: Every SSL exemption is a potential blind spot. Document all exemptions and review them quarterly. Ensure you have compensating controls (EDR, cloud-native security) for exempted traffic.
Verify UTM Optimization
# Check CPU impact before and after UTM changes
get system performance status
# Monitor IPS engine specifically
diagnose ips session status
# Check flow vs proxy session ratio
diagnose sys session stat
# Fewer proxy sessions = more efficient
Expected Impact: Moving from proxy-based to flow-based inspection and optimizing IPS signatures can improve UTM throughput by 100-300% while maintaining 85-95% of detection capability.
Step 6: Firewall Policy Optimization
Policy lookup happens for every new session. The FortiGate evaluates policies top-to-bottom until a match is found. With hundreds of policies, this lookup becomes a measurable performance cost.
Policy Ordering for Performance
Place the most frequently matched policies at the top of the policy list:
# Identify your most-hit policies
diagnose firewall iprope lookup <src-ip> <dst-ip> <src-port> <dst-port> <protocol>
# View policy hit counts
get firewall policy | grep -A2 "policyid"
# Or use the GUI: Policy & Objects > Firewall Policy > Column Settings > Enable "Hit Count"
# Move high-hit policies to the top
config firewall policy
move <high-hit-policy-id> before <first-policy-id>
end
Policy Consolidation
Consolidate overlapping or similar policies to reduce the total count:
# Before: 5 policies for different servers, same inspection profile
# Policy 10: srcaddr=any, dstaddr=WebServer1, service=HTTPS, action=accept
# Policy 11: srcaddr=any, dstaddr=WebServer2, service=HTTPS, action=accept
# Policy 12: srcaddr=any, dstaddr=WebServer3, service=HTTPS, action=accept
# Policy 13: srcaddr=any, dstaddr=WebServer4, service=HTTPS, action=accept
# Policy 14: srcaddr=any, dstaddr=WebServer5, service=HTTPS, action=accept
# After: 1 policy using an address group
config firewall addrgrp
edit "WebServers"
set member "WebServer1" "WebServer2" "WebServer3" "WebServer4" "WebServer5"
next
end
config firewall policy
edit <policy-id>
set srcaddr "all"
set dstaddr "WebServers"
set service "HTTPS"
set action accept
set utm-status enable
set auto-asic-offload enable
next
end
Interface-Based vs Zone-Based Policies
Zone-based policies can simplify and reduce policy count, but interface-based policies enable more granular NP offload:
# Interface-based approach (better for NP offload)
config firewall policy
edit <policy-id>
set srcintf "port1"
set dstintf "port3"
# NP can offload if port1 and port3 are on the same NP chip
next
end
# Zone-based approach (fewer policies, easier management)
config system zone
edit "LAN-zone"
set interface "port3" "port4" "port5"
next
end
Recommendation: Use interface-based policies for high-throughput paths where NP offload matters. Use zones for management and lower-volume traffic where simplicity is more valuable than offload.
Schedule-Based Policies
Use scheduled policies to apply heavy UTM inspection only during business hours:
# Create a business hours schedule
config firewall schedule recurring
edit "business-hours"
set day monday tuesday wednesday thursday friday
set start 07:00
set end 19:00
next
end
# Apply full UTM during business hours, lighter inspection off-hours
config firewall policy
edit <business-policy-id>
set schedule "business-hours"
set utm-status enable
set ips-sensor "full-sensor"
set av-profile "full-av"
next
edit <offhours-policy-id>
set schedule "always"
set utm-status enable
set ips-sensor "optimized-sensor"
set av-profile "optimized-av"
next
end
Reduce Policy Lookup Time
# Enable policy caching (enabled by default, verify it is not disabled)
config system settings
show full | grep "policy-auth-concurrent"
end
# Check policy count
get firewall policy | grep -c "policyid"
# If you have 500+ policies, consider:
# 1. Consolidating with address groups
# 2. Using VDOMs to segment policy tables
# 3. Removing disabled/unused policies
Expected Impact: Reducing policy count by 40-60% through consolidation and optimizing ordering can reduce per-session policy lookup time by 30-50%, which matters at high session setup rates (10,000+ new sessions/second).
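The benefit of ordering can be quantified: with top-to-bottom evaluation, the average lookup cost is the hit-weighted position of each policy in the list. A sketch with invented hit counts:

```python
# Mean policy comparisons per new session = hit-weighted average of each
# policy's position in the evaluation order. Hit counts are illustrative.

def avg_comparisons(policies: list) -> float:
    """policies: list of (name, hit_count) tuples in evaluation order."""
    total_hits = sum(hits for _, hits in policies)
    weighted = sum(pos * hits for pos, (_, hits) in enumerate(policies, start=1))
    return weighted / total_hits

unordered = [("legacy-rule", 10), ("guest-wifi", 500), ("web-out", 90000)]
ordered   = sorted(unordered, key=lambda p: p[1], reverse=True)

print(f"{avg_comparisons(unordered):.2f}")  # ~2.99 comparisons per session
print(f"{avg_comparisons(ordered):.2f}")    # ~1.01 comparisons per session
```

Feed in your real hit counts (from the Hit Count column in the GUI) to see how much a reorder would save before touching the policy table.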
Step 7: SD-WAN Performance Tuning
SD-WAN adds path intelligence but also adds overhead -- health probes, quality measurements, and rule evaluation all consume resources. Optimize to get the path intelligence without the performance penalty.
SLA Probe Optimization
Health check probes run continuously on every SD-WAN member. Each probe consumes bandwidth and generates sessions:
# View current SLA probes
config system sdwan
config health-check
show
end
end
# Optimize probe frequency -- default 500ms is aggressive
config system sdwan
config health-check
edit "wan-sla"
# Increase interval from 500ms to 1000ms for less overhead
set interval 1000
# Reduce probe count (default 30 is high)
set probe-count 10
# Use appropriate protocol
set protocol ping
set server "8.8.8.8" "1.1.1.1"
# Set failure threshold
set failtime 3
# Set recovery threshold
set recoverytime 5
next
end
end
Warning: Increasing probe intervals slows down failover detection. A 1000ms interval with failtime=3 means 3 seconds minimum to detect a link failure. For real-time applications (VoIP, video), keep intervals at 500ms. For general traffic, 1000-2000ms is acceptable.
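The arithmetic behind that warning is simple: minimum detection time is the probe interval multiplied by the failure threshold. A sketch for checking candidate probe settings:

```python
# Minimum link-failure detection time: the link must miss `failtime`
# consecutive probes, each `interval_ms` apart.

def detection_time_sec(interval_ms: int, failtime: int) -> float:
    return interval_ms * failtime / 1000.0

print(detection_time_sec(1000, 3))  # 3.0 s -- the example from the warning
print(detection_time_sec(500, 3))   # 1.5 s -- keep for VoIP/video paths
```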
Performance SLA Thresholds
Set realistic SLA thresholds that trigger path changes without causing flaps:
config system sdwan
config health-check
edit "wan-sla"
config sla
edit 1
# Latency threshold in ms
set latency-threshold 100
# Jitter threshold in ms
set jitter-threshold 30
# Packet loss threshold in percent
set packetloss-threshold 2
next
edit 2
# Degraded SLA -- still usable but not preferred
set latency-threshold 200
set jitter-threshold 50
set packetloss-threshold 5
next
end
next
end
end
Load Balancing Algorithm Selection
Choose the right algorithm for your traffic pattern:
config system sdwan
config service
edit 1
set name "critical-apps"
# source-ip-based: consistent hashing, same source always same path
# Best for: most environments, session persistence
set load-balance-mode source-ip-based
# volume-based: distribute by bandwidth consumption
# Best for: environments with mixed large/small flows
# set load-balance-mode volume-based
# session-based: round-robin per session
# Best for: many small sessions, maximum distribution
# set load-balance-mode sessions
# source-dest-ip-based: hash on both src and dst
# Best for: environments needing session affinity
# set load-balance-mode source-dest-ip-based
next
end
end
| Algorithm | Best For | Overhead |
|---|---|---|
| source-ip-based | General use, session persistence | Low |
| source-dest-ip-based | Asymmetric traffic avoidance | Low |
| sessions | Maximum path utilization | Medium |
| volume-based | Balanced bandwidth distribution | Medium |
| measured-volume-based | Dynamic bandwidth-aware balancing | Higher |
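The session-persistence property of source-ip-based balancing can be modeled as hashing the source address onto the member list. This is a conceptual sketch only -- it is not FortiOS's actual hash function:

```python
import hashlib

# Model of source-ip-based selection: hashing the source address onto
# the member list means a given client always takes the same WAN link.
# Conceptual only; FortiOS's real hash is not reproduced here.

def pick_member(src_ip: str, members: list) -> str:
    digest = hashlib.sha256(src_ip.encode()).digest()
    return members[int.from_bytes(digest[:4], "big") % len(members)]

members = ["wan1", "wan2"]
# Same source always maps to the same member
assert pick_member("10.0.0.15", members) == pick_member("10.0.0.15", members)
for ip in ("10.0.0.15", "10.0.0.16", "10.0.0.17"):
    print(ip, "->", pick_member(ip, members))
```

This is why source-ip-based is low overhead: path choice is a stateless function of the address, with no per-flow bandwidth accounting as in the volume-based modes.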
SD-WAN Rule Ordering
Like firewall policies, SD-WAN rules are evaluated in order. Place the most specific and most-hit rules first:
config system sdwan
config service
# Rule 1: VoIP -- most latency sensitive, most specific
edit 1
set name "voip"
set dst "voip-servers"
set internet-service enable
set internet-service-app-ctrl 41468 # Microsoft Teams
set priority-members 1 # Best latency link
set sla "wan-sla"
set sla-compare-method number
next
# Rule 2: Critical SaaS
edit 2
set name "saas-critical"
set internet-service enable
set internet-service-app-ctrl 16354 16355 # M365
set load-balance-mode source-ip-based
next
# Rule 3: General traffic (catch-all)
edit 3
set name "default"
set dst "all"
set load-balance-mode source-ip-based
next
end
end
Verify SD-WAN Optimization
# Check SD-WAN link quality
diagnose sys sdwan health-check
# Verify traffic distribution
diagnose sys sdwan service
# Monitor SLA status
diagnose sys sdwan intf-sla-log
# Check for unnecessary failovers (flapping)
diagnose sys sdwan log
Expected Impact: Optimized SLA probes reduce the WAN bandwidth consumed by health checks by 10-20%, and sound rule ordering and thresholds make failover timing predictable.
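For context, a back-of-envelope calculation of probe overhead (the 100-byte on-wire probe size is an assumption, not a Fortinet figure):

```python
def probe_overhead_bps(members: int, interval_ms: int, probe_bytes: int = 100) -> float:
    """Approximate bandwidth consumed by SLA probes: one request and one
    reply per member per interval."""
    probes_per_sec = 1000 / interval_ms
    return members * probes_per_sec * probe_bytes * 8 * 2

# Two members probed every 500 ms: ~6.4 kbps -- tiny in absolute terms,
# so the real win of tuning probes is flap avoidance, not raw bandwidth
assert probe_overhead_bps(2, 500) == 6400.0
assert probe_overhead_bps(2, 1000) == 3200.0
```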
Step 8: Memory & Conserve Mode Prevention
Conserve mode is FortiGate's emergency response to memory exhaustion. When triggered, the device stops accepting new sessions, drops proxy-mode connections, and degrades security inspection. Preventing conserve mode is a critical operational priority.
Understanding Conserve Mode Triggers
# Check current memory status and thresholds
diagnose hardware sysinfo memory
# View conserve mode history and current thresholds
diagnose hardware sysinfo conserve
# Top processes by CPU/memory
diagnose sys top 1 20
# Check if currently in conserve mode
get system performance status | grep "Memory"
# Look for: "Memory states: normal" (good) vs "Memory states: conserve" (bad)
# View conserve mode thresholds
config system global
show full | grep conserve
end
Conserve Mode Thresholds
FortiGate enters conserve mode based on memory thresholds:
| Threshold | Default | Behavior |
|---|---|---|
| Green (normal) | < 82% memory used | Normal operation |
| Red (conserve) | > 88% memory used | Device enters conserve mode -- new sessions may be dropped |
| Extreme (kernel conserve) | > 95% memory used | Kernel-level session dropping, proxy daemons stopped |
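The thresholds above act with hysteresis: the device enters conserve mode at the red watermark and only returns to normal once usage drops back below green. A sketch of that state machine (assuming the default 82/88/95 thresholds):

```python
def next_state(state: str, used_pct: float) -> str:
    """Model conserve-mode transitions with green/red/extreme watermarks."""
    if used_pct >= 95:
        return "extreme"
    if used_pct >= 88:
        return "conserve"
    if used_pct < 82:
        return "normal"
    # Between green and red: hysteresis -- stay in conserve if already there
    return "conserve" if state in ("conserve", "extreme") else "normal"

assert next_state("normal", 90) == "conserve"    # crossed red watermark
assert next_state("conserve", 85) == "conserve"  # not yet below green
assert next_state("conserve", 80) == "normal"    # recovered below green
```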
Tune Conserve Mode Thresholds
# Adjust thresholds (use with caution)
config system global
# Memory watermark thresholds (percent of total memory)
set memory-use-threshold-green 82
set memory-use-threshold-red 88
set memory-use-threshold-extreme 95
# Pass traffic rather than drop it if the AV engine runs out of memory
set av-failopen pass
# Let the system balance IPS engine placement across CPUs
set ips-affinity "0"
end
# Set proxy resource limits to prevent proxy from consuming all memory
config firewall profile-protocol-options
edit "optimized"
config http
# Limit oversized HTTP content (prevents memory exhaustion)
set oversize-limit 10
set uncompressed-oversize-limit 12
end
config ftp
set oversize-limit 10
set uncompressed-oversize-limit 12
end
next
end
Memory Usage Monitoring
# Real-time memory breakdown
diagnose sys top 1 30
# Check which processes consume the most memory
diagnose sys top-mem 10
# Monitor proxy memory usage specifically
diagnose test application wad 1
# Session memory usage
diagnose sys session stat | grep "memory"
# Check for memory leaks (increasing usage without corresponding session increase)
# Run periodically and compare:
diagnose hardware sysinfo memory
Proxy Memory Limit Configuration
# Limit memory available to proxy daemons
config system global
# Enable proxy resource control to cap proxy daemon memory usage
set proxy-resource-mode enable
end
# Tune individual proxy limits
config firewall profile-protocol-options
edit "optimized"
config http
set oversize-limit 10
set uncompressed-oversize-limit 12
# Block oversized content rather than buffering it
set block-page-status-code 403
end
next
end
Conserve Mode Prevention Checklist
- Monitor memory continuously -- set SNMP traps at 75% memory usage
- Reduce session TTLs (see Step 4) -- fewer stale sessions = less memory
- Use flow-based inspection where possible -- proxy-based buffers content in memory
- Limit oversized content -- a single 2 GB download in proxy mode can consume significant memory
- Size your FortiGate correctly -- if you are consistently above 70% memory, you need a bigger box
# Set SNMP memory threshold trap
config system snmp sysinfo
set status enable
end
config system snmp community
edit 1
set events cpu-high mem-low
next
end
Expected Impact: Proper memory management and proxy resource limits can prevent conserve mode entirely. The goal is to keep steady-state memory usage below 70%.
Step 9: Interface & Routing Optimization
Network interface configuration and routing efficiency directly impact throughput and latency.
Interface MTU Optimization
# Check current MTU on all interfaces
get system interface | grep mtu
# Set optimal MTU (typically 1500 for Ethernet, 9000 for jumbo frames)
config system interface
edit "port1"
# Standard Ethernet MTU
set mtu 1500
# Override default MTU if path supports jumbo frames
set mtu-override enable
# For data center internal traffic:
# set mtu 9000
next
end
Warning: Setting MTU to 9000 (jumbo frames) requires that every device in the path supports jumbo frames. A single device with a standard 1500 MTU will cause fragmentation, which is worse than using 1500 everywhere.
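The payoff from jumbo frames is modest but measurable; a quick goodput calculation (assumes IPv4+TCP and standard Ethernet framing overhead):

```python
def tcp_goodput_fraction(mtu: int) -> float:
    """Fraction of wire time carrying payload: subtract IPv4+TCP headers
    (40 B) from the MTU, divide by the full frame including Ethernet
    header (14), FCS (4), preamble (8) and inter-frame gap (12) = 38 B."""
    return (mtu - 40) / (mtu + 38)

assert round(tcp_goodput_fraction(1500), 3) == 0.949  # standard frames
assert round(tcp_goodput_fraction(9000), 3) == 0.991  # jumbo frames
# ~4% better wire efficiency plus far fewer packets/sec for the same
# data, consistent with the 5-15% figure for large-transfer workloads
```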
TCP MSS Clamping
Prevent fragmentation on VPN and encapsulated traffic:
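The clamp value follows from simple subtraction; a sketch of where a figure like 1350 comes from (the ~100-byte ESP tunnel-mode overhead is an assumed worst case; the exact number depends on cipher and mode):

```python
def clamped_mss(mtu: int, encap_overhead: int) -> int:
    """MSS = MTU minus encapsulation overhead minus IPv4 (20 B) and TCP (20 B)."""
    return mtu - encap_overhead - 40

# A 1500-byte WAN MTU with ~100 B of ESP overhead leaves room for 1360;
# clamping to 1350 as in the policy below simply adds a safety margin
assert clamped_mss(1500, 100) == 1360
```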
# MSS clamping on FortiGate is applied per firewall policy; on the
# tunnel itself, just make sure phase1 renegotiates automatically
config vpn ipsec phase1-interface
edit "to-azure"
set auto-negotiate enable
next
end
# Per-policy MSS clamping
config firewall policy
edit <vpn-policy-id>
set tcp-mss-sender 1350
set tcp-mss-receiver 1350
next
end
Flow Control Settings
# Check interface flow control
get system interface physical | grep flow
# Enable flow control to prevent packet drops at ingress
config system interface
edit "port1"
set speed auto
set pause-meter-rate 0
next
end
# For high-throughput interfaces, verify duplex and speed
config system interface
edit "port1"
show full | grep speed
# Ensure auto-negotiation is working or force speed if needed
# set speed 10000full # Force 10G full duplex
next
end
Link Aggregation (LACP)
Aggregate multiple physical links for increased bandwidth and redundancy:
# Create an LACP aggregate interface
config system interface
edit "agg1"
set type aggregate
set member "port1" "port2"
set lacp-mode active
set lacp-ha-slave disable
set min-links 1
set algorithm L3 # Hash based on src/dst IP for better distribution
# L4 for src/dst IP + port (best distribution for mixed traffic)
# set algorithm L4
next
end
| LACP Algorithm | Hash Based On | Best For |
|---|---|---|
| L2 | Source/Dest MAC | Single subnet traffic |
| L3 | Source/Dest IP | Multi-subnet traffic |
| L4 | Source/Dest IP + Port | Mixed traffic (recommended) |
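Why L4 beats L3 for mixed traffic is easy to demonstrate: many flows between the same pair of hosts all hash identically under L3, but spread across members once ports enter the key. A toy illustration (Fortinet's actual hash is not public; this uses Python's built-in hash):

```python
def pick_member(key: tuple, members: int = 2) -> int:
    # Illustrative hash-and-modulo member selection, not Fortinet's algorithm
    return hash(key) % members

# 100 HTTPS flows from one client to one server, differing only by source port
l3_keys = [("10.0.0.5", "203.0.113.9") for _ in range(100)]
l4_keys = [("10.0.0.5", "203.0.113.9", 50000 + i, 443) for i in range(100)]

l3_members = {pick_member(k) for k in l3_keys}
l4_members = {pick_member(k) for k in l4_keys}
assert len(l3_members) == 1  # L2/L3 hashing: every flow lands on one link
assert len(l4_members) == 2  # L4 hashing: both links are utilized
```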
ECMP Routing
Equal-Cost Multi-Path routing distributes traffic across multiple next-hops:
# Configure ECMP with multiple static routes
config router static
edit 1
set dst 0.0.0.0/0
set gateway 10.0.1.1
set device "port1"
set distance 10
next
edit 2
set dst 0.0.0.0/0
set gateway 10.0.2.1
set device "port2"
set distance 10
next
end
# Set ECMP load balancing method
config system settings
# source-ip-based (default, recommended)
set v4-ecmp-mode source-ip-based
set ecmp-max-paths 4
end
# Verify ECMP is active
get router info routing-table all | grep "0.0.0.0"
# Should show multiple equal-cost routes
Routing Table Optimization
# Check routing table size (large tables slow lookups and convergence)
get router info routing-table all
# For BGP environments, implement route aggregation
config router bgp
config aggregate-address
edit 1
set prefix 10.0.0.0 255.0.0.0
next
end
end
# Widen the SNAT source port range (helps under port exhaustion)
config system global
set ip-src-port-range 1024-25000
end
Expected Impact: LACP with L4 hashing can roughly double effective bandwidth across two links for well-mixed flows. ECMP routing scales bandwidth roughly linearly per additional path. MTU optimization can improve throughput by 5-15% for large-transfer workloads.
Step 10: Logging & Diagnostics Impact
Logging is essential for visibility and compliance, but excessive logging is a measurable performance drain. The goal is to log what matters without logging everything.
Logging Performance Impact
Each log message generated by the FortiGate consumes:
- CPU cycles to format and write the log
- Memory for log buffering
- Disk I/O (if logging to disk)
- Network bandwidth (if logging to FortiAnalyzer or syslog)
At high session rates, logging can consume 5-15% of CPU.
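To put numbers on that, a rough estimate of log volume (the 600-byte average per-session log size is an assumption):

```python
def log_load(sessions_per_sec: int, bytes_per_log: int = 600):
    """Estimate log bandwidth and daily storage for per-session logging."""
    mbps = sessions_per_sec * bytes_per_log * 8 / 1e6
    gb_per_day = sessions_per_sec * bytes_per_log * 86400 / 1e9
    return mbps, round(gb_per_day, 1)

# 10,000 new sessions/sec with logtraffic all:
mbps, gb_day = log_load(10_000)
assert mbps == 48.0     # ~48 Mbps of log traffic to FortiAnalyzer/syslog
assert gb_day == 518.4  # roughly half a terabyte of raw logs per day
```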
Disk vs Memory vs FortiAnalyzer Logging
# Check current logging configuration
show log setting
show log fortianalyzer setting
show log syslogd setting
show log disk setting
# Recommended: Log to FortiAnalyzer or syslog, not local disk
# Local disk logging creates I/O bottlenecks on high-throughput devices
config log disk setting
set status disable
set max-log-file-size 100
end
# FortiAnalyzer logging (recommended)
config log fortianalyzer setting
set status enable
set server <faz-ip>
set enc-algorithm high
set reliable enable
set upload-option realtime
end
# If using syslog as alternative
config log syslogd setting
set status enable
set server <syslog-ip>
set port 514
set facility local7
set format rfc5424
end
Reduce Log Volume Without Losing Visibility
# Log only denied traffic and UTM events (not all allowed traffic)
config firewall policy
edit <high-volume-allow-policy>
set logtraffic utm
# Options:
# all -- log every session (highest overhead)
# utm -- log only UTM events (moderate overhead, good visibility)
# disable -- no logging (lowest overhead, bad visibility)
next
end
# For high-volume internal policies, log at session end only (skip start logs)
config firewall policy
edit <internal-policy>
set logtraffic utm
set logtraffic-start disable
next
end
# Reduce severity level for routine traffic
config log setting
set fwpolicy-implicit-log enable
set local-in-allow disable
set local-in-deny-broadcast disable
set local-in-deny-unicast disable
set log-invalid-packet disable
end
Performance Diagnostic Commands
These commands help identify performance issues in real-time:
# Top processes by CPU
diagnose sys top 1 20
# Per-interface counters (bytes, errors, drops)
diagnose netlink interface list
# Packet flow debug (use sparingly -- this impacts performance!)
diagnose debug flow filter addr <suspect-ip>
diagnose debug flow show function-name enable
diagnose debug flow trace start 100
# IMPORTANT: Always stop the trace when done
diagnose debug flow trace stop
diagnose debug disable
# Crash log check (performance issues sometimes correlate with crashes)
diagnose debug crashlog read
# Hardware sensor readings (thermal throttling check)
execute sensor list
# NP error counters
diagnose npu np6 sse-stats | grep -i error
diagnose npu np6 sse-stats | grep -i drop
Warning: diagnose debug flow traces are extremely CPU-intensive. Never enable flow traces on a production device during peak hours without limiting the filter scope and packet count. Always stop traces when done; a forgotten debug trace is a common cause of performance degradation.
Verify Logging Optimization
# Check log generation rate
execute log filter device memory
execute log display
# Look at timestamps -- high-volume logging will show hundreds of entries per second
# Compare CPU before and after logging changes
get system performance status
Expected Impact: Moving from logtraffic all to logtraffic utm on high-volume policies can reduce CPU overhead by 5-15% and dramatically reduce FortiAnalyzer storage requirements.
Verification
After completing optimization steps, run a comprehensive verification to measure improvements against your baseline.
Post-Optimization Metrics Collection
# Collect the same metrics as your pre-optimization baseline
# System performance
get system performance status
# CPU breakdown
diagnose sys top 1 20
# Memory status
diagnose hardware sysinfo memory
# Session statistics
diagnose sys session stat
# NP offload ratio
diagnose npu np6 session-stats
# CP utilization
diagnose sys cps stats
# Interface throughput
diagnose netlink interface list
# Verify no conserve mode events
get system performance status | grep "Memory"
Before/After Comparison
| Metric | Baseline | After Optimization | Change |
|---|---|---|---|
| CPU Usage (average) | ___% | ___% | -___% |
| Memory Usage | ___% | ___% | -___% |
| Active Sessions | ___ | ___ | -___ |
| NP Offloaded Sessions | ___% | ___% | +___% |
| Firewall Throughput | ___ Gbps | ___ Gbps | +___% |
| CP Utilization | ___% | ___% | +___% |
| Policy Count | ___ | ___ | -___ |
Ongoing Monitoring
Set up persistent monitoring to track performance over time:
# Configure SNMP monitoring for CPU/memory thresholds
config system snmp sysinfo
set status enable
set engine-id "local"
set contact-info "noc@yourcompany.com"
end
config system snmp community
edit 1
set name "<community-string>"
set events cpu-high mem-low log-full intf-ip fm-conf-change
config hosts
edit 1
set ip <monitoring-server-ip> 255.255.255.255
next
end
next
end
# Enable performance logging to FortiAnalyzer
config log fortianalyzer setting
set status enable
set upload-option realtime
end
Validate Security Posture
Performance optimization must not compromise security. Verify:
# Confirm IPS is still active and detecting
diagnose ips session status
# Verify AV engine is running and updating
diagnose autoupdate status
diagnose autoupdate versions
# Check SSL inspection is functional
diagnose test application ssl 1
# Verify firewall policies are still matching correctly
diagnose firewall iprope lookup <test-src-ip> <test-dst-ip> <port> <port> <proto>
Troubleshooting
NP Offload Failures
Symptom: NP offloaded session count is low despite eligible traffic.
# Check NP health
diagnose npu np6 sse-stats
# Look for error counters, session table full, or hardware errors
# Verify interface-to-NP mapping
diagnose npu np6 port-list
# Ensure ingress and egress interfaces are on the same NP
# Check for NP session table exhaustion
diagnose npu np6 session-stats | grep "max"
# If current count is near max, the NP session table is full
# Check policy settings
config firewall policy
edit <policy-id>
show full | grep auto-asic-offload
next
end
# Ensure auto-asic-offload is "enable"
Resolution:
- Verify interfaces are on the same NP chip
- Enable auto-asic-offload on applicable policies
- Move to flow-based inspection (proxy mode prevents NP offload)
- Disable unnecessary session helpers/ALGs
- Check for firmware bugs in your FortiOS version release notes
Conserve Mode
Symptom: FortiGate enters conserve mode, new sessions dropped, users report connectivity failures.
# Check current memory state
get system performance status
# Identify memory consumers
diagnose sys top 1 30
# Look for processes using >20% memory: wad (proxy), ipsengine, scanunitd
# Check session count (excessive sessions consume memory)
diagnose sys session stat
# Check for memory leaks
diagnose hardware sysinfo memory
# Run multiple times over 10 minutes -- if usage only climbs, suspect a leak
Resolution:
- Reduce session TTLs to free stale sessions
- Switch from proxy-mode to flow-mode inspection
- Limit oversized content buffering in protocol options
- Add memory (if hardware supports DIMM upgrade)
- If recurring, right-size to a larger FortiGate model
- As an emergency action: diagnose sys session clear (clears ALL sessions -- use only in emergencies)
Warning: diagnose sys session clear drops every active session on the FortiGate, including management sessions. You will be disconnected, and all users will experience a brief outage as sessions re-establish. Only use this as a last resort to exit conserve mode.
Session Table Exhaustion
Symptom: New sessions are dropped even though CPU and memory appear normal.
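Since the FortiOS CLI has no awk or sort, the per-source breakdown below is easiest done offline: capture the session list via terminal logging and post-process it. A minimal sketch (the line format shown is illustrative; real `diagnose sys session list` output is more verbose):

```python
from collections import Counter
import re

def top_sources(session_dump: str, n: int = 5):
    """Count sessions per source IP in a captured session-list dump."""
    srcs = re.findall(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+->", session_dump)
    return Counter(srcs).most_common(n)

dump = ("10.0.0.5:51515->198.51.100.7:443\n" * 3
        + "10.0.0.9:40000->198.51.100.7:80\n")
assert top_sources(dump)[0] == ("10.0.0.5", 3)  # noisiest source first
```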
# Check session count vs maximum
diagnose sys session stat
# If session_count is at or near the maximum, the table is full
# Identify what is filling the session table; the FortiOS CLI has no awk,
# so capture the output and post-process it on a Linux host:
diagnose sys session list
# (offline) awk '{print $4}' sessions.txt | sort | uniq -c | sort -rn | head -20
# This shows source IPs with the most sessions -- could indicate a scan or worm
# Check for session table DoS
diagnose sys session stat | grep "clash"
# High clash count indicates hash table collisions (possible DoS)
Resolution:
- Reduce session TTLs (see Step 4)
- Identify and block source IPs with excessive sessions (possible compromise or misconfiguration)
- Increase session table size if hardware supports it
- Enable DoS policy to limit per-source session counts:
config firewall DoS-policy
edit 1
set interface "wan1"
config anomaly
edit "tcp_src_session"
set status enable
set action block
set threshold 1000
next
end
next
end
Asymmetric Routing Issues
Symptom: Sessions are being dropped or logged as denied, but the firewall policy should allow them. Often seen with ECMP, multiple ISPs, or HA configurations.
# Check for asymmetric routing indicators in logs
execute log filter field subtype forward
execute log filter field action deny
execute log display
# Look for "reverse path check fail" or "no matching policy" for traffic
# that should be allowed
# Enable loose RPF (Reverse Path Forwarding) check
config system settings
set strict-src-check disable
end
# Or enable asymmetric routing
config system settings
set asymroute enable
end
Resolution:
- Fix routing to be symmetric where possible (preferred)
- Enable asymroute only when asymmetric paths are unavoidable (it relaxes stateful inspection)
- For ECMP: ensure both paths return via the same FortiGate (or use HA)
- For multi-ISP: use policy routing to ensure return traffic matches the ingress path
SD-WAN Path Flapping
Symptom: SD-WAN continuously switches between paths, causing session disruptions.
# Check SLA status history
diagnose sys sdwan health-check
# Look for rapid state changes
diagnose sys sdwan intf-sla-log
# Check if thresholds are too tight
config system sdwan
config health-check
edit "wan-sla"
show
next
end
end
Resolution:
- Increase probe intervals to smooth out momentary blips
- Widen SLA thresholds (e.g., latency from 50ms to 100ms)
- Increase failtime and recoverytime to require sustained failure/recovery before switching
- Use hold-down-time to prevent rapid re-switching:
config system sdwan
config health-check
edit "wan-sla"
set failtime 5
set recoverytime 10
set interval 1000
next
end
end
High CPU from IPS Engine
Symptom: ipsengine process consuming 60%+ CPU.
# Check IPS engine status
diagnose ips session status
# Identify which signatures are firing most
diagnose ips anomaly list
# Check if IPS database is current
diagnose autoupdate versions | grep -A3 "IPS"
Resolution:
- Optimize IPS sensor (see Step 5) -- remove irrelevant signatures
- Enable CP offload for IPS pattern matching
- Switch from proxy-based to flow-based IPS inspection
- If a specific signature is firing excessively on legitimate traffic, create an IPS exemption:
config ips sensor
edit "optimized-sensor"
config entries
edit <entry-id>
config exempt-ip
edit 1
set src-ip <legitimate-source>
next
end
next
end
next
end
Final Note: Performance optimization is not a one-time task. Traffic patterns change, new applications are deployed, firmware updates alter behavior, and session loads grow. Schedule quarterly performance reviews using the baseline methodology documented in this guide. Compare current metrics to your original baseline and to the post-optimization measurements. Adjust as needed. The FortiGate that was optimized for today's traffic may need re-tuning for next quarter's growth.