author     Ben Hutchings <benh@debian.org>    2022-01-02 03:15:50 +0100
committer  Ben Hutchings <benh@debian.org>    2022-01-02 03:17:24 +0100
commit     6ee28b063993a2cf72f63f4b6c48a6324c7df182 (patch)
tree       c45ef2c84dd1415cd7e7a8473cd4d67f519957cc
parent     435596fd653fe61a6f04671f8875146484793048 (diff)
parent     ffcdf42eb1f9451137000eb48b4e7427b44344de (diff)
download   linux-debian-6ee28b063993a2cf72f63f4b6c48a6324c7df182.tar.gz
Merge tag 'debian/5.15.5-2' into bullseye-backports
Release linux (5.15.5-2).
503 files changed, 15077 insertions, 37490 deletions
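Each stanza added to debian/changelog in the diff below follows the standard Debian header format `source (version) distribution; urgency=level`. The backports rebuild only mangles the version with a `~bpo11+1` suffix; because a `~` component sorts before anything else in dpkg version ordering, `5.15.5-2~bpo11+1` correctly sorts below `5.15.5-2`. A minimal sketch of pulling those header fields apart (the regex is illustrative, not dpkg's or python-debian's canonical parser):

```python
import re

# Debian changelog stanza header: "source (version) distribution; urgency=level"
# Illustrative pattern only -- real parsing should use python-debian's
# debian.changelog module or dpkg-parsechangelog.
HEADER = re.compile(
    r"^(?P<source>[a-z0-9][a-z0-9.+-]+) "
    r"\((?P<version>[^)]+)\) "
    r"(?P<dist>[^;]+); urgency=(?P<urgency>\S+)$"
)

def parse_header(line):
    """Return a dict of the stanza-header fields, or None if the line
    is not a stanza header (e.g. a bullet or a trailer line)."""
    m = HEADER.match(line)
    return m.groupdict() if m else None

print(parse_header("linux (5.15.5-2~bpo11+1) UNRELEASED; urgency=medium"))
print(parse_header("linux (5.15.5-2) unstable; urgency=medium"))
```

Run against the two headers at the top of this diff, the first call yields the backports version `5.15.5-2~bpo11+1` targeting `UNRELEASED`, the second the plain `5.15.5-2` upload to `unstable`.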
diff --git a/debian/.gitignore b/debian/.gitignore
index 8f1918f3e..c328c7275 100644
--- a/debian/.gitignore
+++ b/debian/.gitignore
@@ -24,6 +24,7 @@
 /files
 /linux-*
 /rules.gen
+/tests/control
 
 # Ignore compiled Python files
 __pycache__/
diff --git a/debian/changelog b/debian/changelog
index 176f71dd6..9880f6d5f 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,1148 @@
+linux (5.15.5-2~bpo11+1) UNRELEASED; urgency=medium
+
+  * Rebuild for bullseye-backports:
+    - Change ABI number to 0.bpo.2
+
+ -- Ben Hutchings <benh@debian.org>  Sun, 02 Jan 2022 03:15:12 +0100
+
+linux (5.15.5-2) unstable; urgency=medium
+
+  * atlantic: Fix OOB read and write in hw_atl_utils_fw_rpc_wait
+    (CVE-2021-43975)
+  * fget: check that the fd still exists after getting a ref to it
+    (CVE-2021-4083)
+  * USB: gadget: detect too-big endpoint 0 requests (CVE-2021-39685)
+  * USB: gadget: zero allocate endpoint 0 buffers (CVE-2021-39685)
+  * [x86] Revert "drm/i915: Implement Wa_1508744258" (Closes: #1001128)
+  * nfsd: fix use-after-free due to delegation race (Closes: #988044)
+  * bpf: Fix kernel address leakage in atomic fetch
+  * bpf: Fix signed bounds propagation after mov32
+  * bpf: Make 32->64 bounds propagation slightly more robust
+  * bpf: Fix kernel address leakage in atomic cmpxchg's r0 aux reg
+
+ -- Salvatore Bonaccorso <carnil@debian.org>  Sun, 19 Dec 2021 00:20:10 +0100
+
+linux (5.15.5-1) unstable; urgency=medium
+
+  * New upstream stable update:
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.4
+    - string: uninline memcpy_and_pad
+    - [x86] KVM: Fix steal time asm constraints
+    - btrfs: introduce btrfs_is_data_reloc_root
+    - btrfs: zoned: add a dedicated data relocation block group
+    - btrfs: zoned: only allow one process to add pages to a relocation inode
+    - btrfs: zoned: use regular writes for relocation
+    - btrfs: check for relocation inodes on zoned btrfs in should_nocow
+    - btrfs: zoned: allow preallocation for relocation inodes
+    - block: Add a helper to validate the block size
+    - loop: Use blk_validate_block_size() to validate block size
+    - Bluetooth: btusb: Add support for TP-Link UB500 Adapter
+    - PCI/MSI: Deal with devices lying about their MSI mask capability
+    - PCI: Add MSI masking quirk for Nvidia ION AHCI
+    - perf/core: Avoid put_page() when GUP fails
+    - thermal: Fix NULL pointer dereferences in of_thermal_ functions
+    - Revert "ACPI: scan: Release PM resources blocked by unused objects"
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.5
+    - [arm64] zynqmp: Do not duplicate flash partition label property
+    - [arm64] zynqmp: Fix serial compatible string
+    - [arm64,armhf] clk: sunxi-ng: Unregister clocks/resets when unbinding
+    - scsi: pm80xx: Fix memory leak during rmmod
+    - scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq()
+    - [armhf] bus: ti-sysc: Add quirk handling for reinit on context lost
+    - [armhf] bus: ti-sysc: Use context lost quirk for otg
+    - [armhf] usb: musb: tusb6010: check return value after calling
+      platform_get_resource()
+    - [x86] usb: typec: tipd: Remove WARN_ON in tps6598x_block_read
+    - staging: rtl8723bs: remove possible deadlock when disconnect (v2)
+    - staging: rtl8723bs: remove a second possible deadlock
+    - staging: rtl8723bs: remove a third possible deadlock
+    - [arm64] dts: ls1012a: Add serial alias for ls1012a-rdb
+    - RDMA/rxe: Separate HW and SW l/rkeys
+    - [x86] ASoC: SOF: Intel: hda-dai: fix potential locking issue
+    - scsi: core: Fix scsi_mode_sense() buffer length handling
+    - ALSA: usb-audio: disable implicit feedback sync for Behringer UFX1204 and
+      UFX1604
+    - [armhf] clk: imx: imx6ul: Move csi_sel mux to correct base register
+    - ASoC: es8316: Use IRQF_NO_AUTOEN when requesting the IRQ
+    - [x86] ASoC: rt5651: Use IRQF_NO_AUTOEN when requesting the IRQ
+    - [x86] ASoC: nau8824: Add DMI quirk mechanism for active-high jack-detect
+    - scsi: advansys: Fix kernel pointer leak
+    - scsi: smartpqi: Add controller handshake during kdump
+    - [arm64] dts: imx8mm-kontron: Fix reset delays for ethernet PHY
+    - ALSA: intel-dsp-config: add quirk for APL/GLK/TGL devices based on ES8336
+      codec
+    - [x86] ASoC: Intel: soc-acpi: add missing quirk for TGL SDCA single amp
+    - [x86] ASoC: Intel: sof_sdw: add missing quirk for Dell SKU 0A45
+    - firmware_loader: fix pre-allocated buf built-in firmware use
+    - HID: multitouch: disable sticky fingers for UPERFECT Y
+    - ALSA: usb-audio: Add support for the Pioneer DJM 750MK2 Mixer/Soundcard
+    - ASoC: rt5682: fix a little pop while playback
+    - [amd64] iommu/vt-d: Do not falsely log intel_iommu is unsupported kernel
+      option
+    - tty: tty_buffer: Fix the softlockup issue in flush_to_ldisc
+    - scsi: scsi_debug: Fix out-of-bound read in resp_readcap16()
+    - scsi: scsi_debug: Fix out-of-bound read in resp_report_tgtpgs()
+    - scsi: target: Fix ordered tag handling
+    - scsi: target: Fix alua_tg_pt_gps_count tracking
+    - iio: imu: st_lsm6dsx: Avoid potential array overflow in
+      st_lsm6dsx_set_odr()
+    - RDMA/core: Use kvzalloc when allocating the struct ib_port
+    - scsi: lpfc: Fix use-after-free in lpfc_unreg_rpi() routine
+    - scsi: lpfc: Fix link down processing to address NULL pointer dereference
+    - scsi: lpfc: Allow fabric node recovery if recovery is in progress before
+      devloss
+    - [i386] ALSA: gus: fix null pointer dereference on pointer block
+    - ALSA: usb-audio: fix null pointer dereference on pointer cs_desc
+    - f2fs: fix up f2fs_lookup tracepoints
+    - f2fs: fix to use WHINT_MODE
+    - f2fs: fix wrong condition to trigger background checkpoint correctly
+    - f2fs: compress: disallow disabling compress on non-empty compressed file
+    - f2fs: fix incorrect return value in f2fs_sanity_check_ckpt()
+    - [armhf] clk/ast2600: Fix soc revision for AHB
+    - [arm64] clk: qcom: gcc-msm8996: Drop (again) gcc_aggre1_pnoc_ahb_clk
+    - [arm64] KVM: arm64: Fix host stage-2 finalization
+    - sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
+    - sched/fair: Prevent dead task groups from regaining cfs_rq's
+    - [x86] perf/x86/vlbr: Add c->flags to vlbr event constraints
+    - blkcg: Remove extra blkcg_bio_issue_init
+    - drm/nouveau: hdmigv100.c: fix corrupted HDMI Vendor InfoFrame
+    - bpf: Fix inner map state pruning regression.
+    - tcp: Fix uninitialized access in skb frags array for Rx 0cp.
+    - tracing: Add length protection to histogram string copies
+    - nl80211: fix radio statistics in survey dump
+    - mac80211: fix monitor_sdata RCU/locking assertions
+    - net: bnx2x: fix variable dereferenced before check
+    - bnxt_en: reject indirect blk offload when hw-tc-offload is off
+    - tipc: only accept encrypted MSG_CRYPTO msgs
+    - sock: fix /proc/net/sockstat underflow in sk_clone_lock()
+    - net/smc: Make sure the link_id is unique
+    - NFSD: Fix exposure in nfsd4_decode_bitmap()
+    - iavf: Fix return of set the new channel count
+    - iavf: check for null in iavf_fix_features
+    - iavf: free q_vectors before queues in iavf_disable_vf
+    - iavf: don't clear a lock we don't hold
+    - iavf: Fix failure to exit out from last all-multicast mode
+    - iavf: prevent accidental free of filter structure
+    - iavf: validate pointers
+    - iavf: Fix for the false positive ASQ/ARQ errors while issuing VF reset
+    - iavf: Fix for setting queues to 0
+    - iavf: Restore VLAN filters after link down
+    - bpf: Fix toctou on read-only map's constant scalar tracking
+      (CVE-2021-4001)
+    - [x86] platform/x86: hp_accel: Fix an error handling path in
+      'lis3lv02d_probe()'
+    - udp: Validate checksum in udp_read_sock()
+    - btrfs: make 1-bit bit-fields of scrub_page unsigned int
+    - RDMA/core: Set send and receive CQ before forwarding to the driver
+    - net/mlx5e: Wait for concurrent flow deletion during neigh/fib events
+    - net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdev
+    - net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove()
+    - net/mlx5: Update error handler for UCTX and UMEM
+    - net/mlx5: E-Switch, rebuild lag only when needed
+    - net/mlx5e: CT, Fix multiple allocations and memleak of mod acts
+    - net/mlx5: Lag, update tracker when state change event received
+    - net/mlx5: E-Switch, return error if encap isn't supported
+    - scsi: ufs: core: Improve SCSI abort handling
+    - scsi: core: sysfs: Fix hang when device state is set via sysfs
+    - scsi: ufs: core: Fix task management completion timeout race
+    - scsi: ufs: core: Fix another task management completion race
+    - [arm*] net: mvmdio: fix compilation warning
+    - net: sched: act_mirred: drop dst for the direction from egress to ingress
+    - [arm64] net: dpaa2-eth: fix use-after-free in dpaa2_eth_remove
+    - net: virtio_net_hdr_to_skb: count transport header in UFO
+    - i40e: Fix correct max_pkt_size on VF RX queue
+    - i40e: Fix NULL ptr dereference on VSI filter sync
+    - i40e: Fix changing previously set num_queue_pairs for PFs
+    - i40e: Fix ping is lost after configuring ADq on VF
+    - RDMA/mlx4: Do not fail the registration on port stats
+    - i40e: Fix warning message and call stack during rmmod i40e driver
+    - i40e: Fix creation of first queue by omitting it if is not power of two
+    - i40e: Fix display error code in dmesg
+    - e100: fix device suspend/resume (Closes: #995927)
+    - [powerpc*] KVM: PPC: Book3S HV: Use GLOBAL_TOC for
+      kvmppc_h_set_dabr/xdabr()
+    - [powerpc*] pseries: rename numa_dist_table to form2_distances
+    - [powerpc*] pseries: Fix numa FORM2 parsing fallback code
+    - [x86] perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake
+      Server
+    - [x86] perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
+    - [x86] perf/x86/intel/uncore: Fix IIO event constraints for Snowridge
+    - [s390x] kexec: fix return code handling
+    - blk-cgroup: fix missing put device in error path from blkg_conf_pref()
+    - tun: fix bonding active backup with arp monitoring
+    - tipc: check for null after calling kmemdup
+    - ipc: WARN if trying to remove ipc object which is absent
+    - shm: extend forced shm destroy to support objects from several IPC nses
+    - hugetlb, userfaultfd: fix reservation restore on userfaultfd error
+    - [x86] boot: Pull up cmdline preparation and early param parsing
+    - [x86] hyperv: Fix NULL deref in set_hv_tscchange_cb() if Hyper-V setup
+      fails
+    - [x86] KVM: x86: Assume a 64-bit hypercall for guests with protected state
+    - [x86] KVM: x86: Fix uninitialized eoi_exit_bitmap usage in
+      vcpu_load_eoi_exitmap()
+    - [x86] KVM: x86/mmu: include EFER.LMA in extended mmu role
+    - [x86] KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO
+    - [powerpc*] xive: Change IRQ domain to a tree domain
+    - [x86] Revert "drm/i915/tgl/dsi: Gate the ddi clocks after pll mapping"
+    - ata: libata: improve ata_read_log_page() error message
+    - ata: libata: add missing ata_identify_page_supported() calls
+    - scsi: qla2xxx: Fix mailbox direction flags in qla2xxx_get_adapter_id()
+    - [s390x] setup: avoid reserving memory above identity mapping
+    - [s390x] boot: simplify and fix kernel memory layout setup
+    - [s390x] vdso: filter out -mstack-guard and -mstack-size
+    - [s390x] dump: fix copying to user-space of swapped kdump oldmem
+    - block: Check ADMIN before NICE for IOPRIO_CLASS_RT
+    - fbdev: Prevent probing generic drivers if a FB is already registered
+    - [x86] KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs
+    - [x86] KVM: nVMX: don't use vcpu->arch.efer when checking host state on
+      nested state load
+    - drm/cma-helper: Release non-coherent memory with dma_free_noncoherent()
+    - printk: restore flushing of NMI buffers on remote CPUs after NMI
+      backtraces
+    - udf: Fix crash after seekdir
+    - spi: fix use-after-free of the add_lock mutex
+    - [armhf] net: stmmac: socfpga: add runtime suspend/resume callback for
+      stratix10 platform
+    - [x86] Drivers: hv: balloon: Use VMBUS_RING_SIZE() wrapper for dm_ring_size
+    - btrfs: fix memory ordering between normal and ordered work functions
+    - fs: handle circular mappings correctly
+    - net: stmmac: Fix signed/unsigned wreckage
+    - cfg80211: call cfg80211_stop_ap when switch from P2P_GO type
+    - mac80211: drop check for DONT_REORDER in __ieee80211_select_queue
+    - drm/amd/display: Update swizzle mode enums
+    - drm/amd/display: Limit max DSC target bpp for specific monitors
+    - [x86] drm/i915/guc: Fix outstanding G2H accounting
+    - [x86] drm/i915/guc: Don't enable scheduling on a banned context, guc_id
+      invalid, not registered
+    - [x86] drm/i915/guc: Workaround reset G2H is received after schedule done
+      G2H
+    - [x86] drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
+    - [x86] drm/i915/guc: Unwind context requests in reverse order
+    - drm/udl: fix control-message timeout
+    - drm/prime: Fix use after free in mmap with drm_gem_ttm_mmap
+    - drm/nouveau: Add a dedicated mutex for the clients list (CVE-2020-27820)
+    - drm/nouveau: use drm_dev_unplug() during device removal (CVE-2020-27820)
+    - drm/nouveau: clean up all clients on device removal (CVE-2020-27820)
+    - [x86] drm/i915/dp: Ensure sink rate values are always valid
+    - [x86] drm/i915/dp: Ensure max link params are always valid
+    - [x86] drm/i915: Fix type1 DVI DP dual mode adapter heuristic for modern
+      platforms
+    - drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga
+      and dvi connectors
+    - drm/amd/pm: avoid duplicate powergate/ungate setting
+    - signal: Implement force_fatal_sig
+    - exit/syscall_user_dispatch: Send ordinary signals on failure
+    - [powerpc*] signal/powerpc: On swapcontext failure force SIGSEGV
+    - [s390x] signal/s390: Use force_sigsegv in default_trap_handler
+    - [x86] signal/x86: In emulate_vsyscall force a signal instead of calling
+      do_exit
+    - signal: Replace force_sigsegv(SIGSEGV) with force_fatal_sig(SIGSEGV)
+    - signal: Don't always set SA_IMMUTABLE for forced signals
+    - signal: Replace force_fatal_sig with force_exit_sig when in doubt
+    - hugetlbfs: flush TLBs correctly after huge_pmd_unshare (CVE-2021-4002)
+    - RDMA/netlink: Add __maybe_unused to static inline in C file
+    - bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs
+    - selinux: fix NULL-pointer dereference when hashtab allocation fails
+    - ASoC: DAPM: Cover regression by kctl change notification fix
+    - ice: Fix VF true promiscuous mode
+    - ice: Delete always true check of PF pointer
+    - fs: export an inode_update_time helper
+    - btrfs: update device path inode time instead of bd_inode
+    - net: add and use skb_unclone_keeptruesize() helper
+    - [x86] ALSA: hda: hdac_ext_stream: fix potential locking issues
+    - ALSA: hda: hdac_stream: fix potential locking issue in
+      snd_hdac_stream_assign()
+
+  [ Salvatore Bonaccorso ]
+  * [rt] Update to 5.15.3-rt21
+  * Drop "arm64: dts: rockchip: disable USB type-c DisplayPort"
+  * [rt] Refresh "printk: move console printing to kthreads"
+  * [rt] Refresh "printk: remove deferred printing"
+  * Bump ABI to 2
+  * fuse: release pipe buf after last use (Closes: #1000504)
+
+ -- Salvatore Bonaccorso <carnil@debian.org>  Fri, 26 Nov 2021 06:33:39 +0100
+
+linux (5.15.3-1) unstable; urgency=medium
+
+  * New upstream stable update:
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.3
+    - Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg()
+      (CVE-2021-3640)
+
+  [ Vincent Blut ]
+  * [arm64] sound/soc/meson: Enable SND_MESON_AXG_SOUND_CARD as module
+    (Closes: #999638)
+  * [arm64,armhf] sound/soc/meson: Enable SND_MESON_GX_SOUND_CARD as module
+  * drivers/bluetooth: Enable BT_HCIBTUSB_MTK (Closes: #999748)
+
+  [ Salvatore Bonaccorso ]
+  * mac80211: fix radiotap header generation
+  * [rt] Update to 5.15.2-rt20
+  * [rt] Refresh "printk: introduce kernel sync mode"
+  * [rt] Refresh "printk: move console printing to kthreads"
+  * [rt] Drop "rcutorture: Avoid problematic critical section nesting on
+    PREEMPT_RT"
+  * [rt] Drop "lockdep: Let lock_is_held_type() detect recursive read as read"
+  * [rt] Refresh "x86/softirq: Disable softirq stacks on PREEMPT_RT"
+  * [rt] Refresh "POWERPC: Allow to enable RT"
+  * Set ABI to 1
+
+ -- Salvatore Bonaccorso <carnil@debian.org>  Thu, 18 Nov 2021 22:32:07 +0100
+
+linux (5.15.2-1~exp1) experimental; urgency=medium
+
+  * New upstream stable update:
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.2
+
+  [ Salvatore Bonaccorso ]
+  * [rt] Update to 5.15-rt17 and reenable (Closes: #995466)
+  * perf srcline: Use long-running addr2line per DSO (Closes: #911815)
+  * Refresh "Export symbols needed by Android drivers"
+  * [rt] Update to 5.15.2-rt19
+  * Input: elantench - fix misreporting trackpoint coordinates (Closes: #989285)
+  * kernel/time: Enable NO_HZ_FULL (Closes: #804857)
+  * io-wq: serialize hash clear with wakeup (Closes: #996951)
+
+  [ Vincent Blut ]
+  * [x86] drivers/ptp: Enable PTP_1588_CLOCK_VMW as module
+  * drivers/ptp: Enable PTP_1588_CLOCK_DTE, PTP_1588_CLOCK_IDT82P33,
+    PTP_1588_CLOCK_IDTCM, PTP_1588_CLOCK_OCP as modules
+  * drivers/ptp, net: Enable DP83640_PHY, PTP_1588_CLOCK_INES,
+    NET_PTP_CLASSIFY, NETWORK_PHY_TIMESTAMPING
+
+ -- Salvatore Bonaccorso <carnil@debian.org>  Sun, 14 Nov 2021 14:27:40 +0100
+
+linux (5.15.1-1~exp1) experimental; urgency=medium
+
+  * New upstream stable update:
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.15.1
+
+  [ Salvatore Bonaccorso ]
+  * [arm*] drop cc-option fallbacks for architecture selection
+  * net/tls: Enable TLS as module (Closes: #919807)
+
+  [ Diederik de Haas ]
+  * [x86] drivers/hwmon: Enable SENSORS_CORSAIR_PSU as module
+  * [arm64] drivers/hwmon: Enable SENSORS_GPIO_FAN as module
+
+ -- Salvatore Bonaccorso <carnil@debian.org>  Sun, 07 Nov 2021 11:22:47 +0100
+
+linux (5.15-1~exp1) experimental; urgency=medium
+
+  * New upstream release candidate
+
+  [ Diederik de Haas ]
+  * [arm*] drivers/led/trigger: Make LEDS_TRIGGER_HEARTBEAT builtin
+    (Closes: #992184)
+  * [arm64] sound/soc/codecs: Enable SND_SOC_SPDIF as module
+  * [armel/rpi] Enable RPi's clock framework and CPU Freq scaling
+  * [armel/rpi] Change default governor to 'ondemand' for RPi 0/0w/1
+    (Closes: #991921)
+  * [arm64] sound/soc/rockchip: Enable SND_SOC_ROCKCHIP_PDM as module
+  * [armel] Make explicit that -rpi kernel variant is for RPi 0/0w/1, not the
+    others
+
+  [ Nathan Schulte ]
+  * [arm64] drivers/staging/media/hantro: Enable VIDEO_HANTRO as module
+  * [arm64] drivers/staging/media/rkvdec: Enable VIDEO_ROCKCHIP_VDEC as module
+    (Closes: #993902)
+
+  [ Vincent Blut ]
+  * [arm] arch/arm/crypto: Enable CRYPTO_BLAKE2S_ARM, CRYPTO_SHA256_ARM and
+    CRYPTO_SHA512_ARM as modules
+  * [armhf] arch/arm/crypto: Enable most NEON based implementation of
+    cryptographic algorithms as modules
+  * [arm] Move CRYPTO_NHPOLY1305_NEON in armhf config file
+  * [arm64] drivers/gpu/drm/vmwgfx: Enable DRM_VMWGFX as module
+    (Closes: #995276)
+  * [armhf] sound/soc/sunxi: Enable SND_SUN4I_I2S as module (Closes: #971892)
+  * [armhf] drivers/gpu/drm/bridge/synopsys: Enable DRM_DW_HDMI_I2S_AUDIO as
+    module
+  * drivers/usb/serial: Enable USB_SERIAL_XR as module (Closes: #996962)
+  * drivers/bus/mhi: Enable MHI_BUS, MHI_BUS_PCI_GENERIC as modules
+    (Closes: #995407)
+  * drivers/net: Enable MHI_NET as module
+  * drivers/net/wwan: Enable WWAN, MHI_WWAN_CTRL as modules
+
+  [ YunQiang Su ]
+  * [mipsel,mips64el/loongson-3] linux-image: Recommend pmon-update
+
+  [ Salvatore Bonaccorso ]
+  * Compile with gcc-11 on all architectures
+  * [arm64] drivers/net: Enable VMXNET3 as module
+
+  [ Uwe Kleine-König ]
+  * [arm64] Enable various symbols for the librem5 devkit and iMX8MN Variscite
+    Symphony (Patches by Guido Günther and Ariel D'Alessandro)
+  * [armhf,arm64] Cherrypick fix for snvs_pwrkey to prevent a machine hang.
+
+  [ Heiko Thiery ]
+  * [arm64] drivers/mtd/spi-nor: enable MTD_SPI_NOR as module
+  * [arm64] drivers/net/can/spi: enable CAN_MCP251X as module
+  * [arm64] drivers/net/phy: enable MICROSEMI_PHY as module
+  * [arm64] drivers/net/usb: enable USB_NET_SMSC95XX as module
+
+  [ Ryutaroh Matsumoto ]
+  * [arm64] Enable TOUCHSCREEN_RASPBERRYPI_FW and
+    REGULATOR_RASPBERRYPI_TOUCHSCREEN_ATTINY (Closes: #977575)
+
+  [ Ariel D'Alessandro ]
+  * [arm64] drivers/regulator: Enable REGULATOR_BD718XX as module
+
+  [ Lubomir Rintel ]
+  * [armhf] Add support for Marvell MMP3
+  * [armhf] Enable SND_MMP_SOC_SSPA, COMMON_CLK_MMP2_AUDIO, PHY_MMP3_USB,
+    MFD_ENE_KB3930 and LEDS_ARIEL as modules.
+
+  [ Sean McAvoy ]
+  * [armel] marvell: Enable CONFIG_SENSORS_LM63 as a module.
+
+  [ Dan Stefura ]
+  * [arm64] enable i6300esb watchdog kernel module
+
+  [ Thore Sommer ]
+  * drivers/md: Enable DM_VERITY_FEC
+
+  [ Aurelien Jarno ]
+  * [riscv64] Enable NUMA (Closes: #993453)
+
+ -- Bastian Blank <bastian.blank@credativ.de>  Thu, 04 Nov 2021 09:01:01 +0100
+
+linux (5.14.16-1) unstable; urgency=medium
+
+  * New upstream stable update:
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.13
+    - ext4: check and update i_disksize properly
+    - ext4: correct the error path of ext4_write_inline_data_end()
+    - [x86] ASoC: Intel: sof_sdw: tag SoundWire BEs as non-atomic
+    - ALSA: oxfw: fix transmission method for Loud models based on OXFW971
+    - ALSA: usb-audio: Unify mixer resume and reset_resume procedure
+    - HID: apple: Fix logical maximum and usage maximum of Magic Keyboard JIS
+    - netfilter: ip6_tables: zero-initialize fragment offset
+    - HID: wacom: Add new Intuos BT (CTL-4100WL/CTL-6100WL) device IDs
+    - [x86] ASoC: SOF: loader: release_firmware() on load failure to avoid
+      batching
+    - netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic
+    - netfilter: nf_nat_masquerade: defer conntrack walk to work queue
+    - mac80211: Drop frames from invalid MAC address in ad-hoc mode
+    - [m68k] Handle arrivals of multiple signals correctly
+    - net: prevent user from passing illegal stab size
+    - mac80211: check return value of rhashtable_init
+    - [x86] vboxfs: fix broken legacy mount signature checking
+    - drm/amdgpu: fix gart.bo pin_count leak
+    - scsi: ses: Fix unsigned comparison with less than zero
+    - scsi: virtio_scsi: Fix spelling mistake "Unsupport" -> "Unsupported"
+    - scsi: qla2xxx: Fix excessive messages during device logout
+    - perf/core: fix userpage->time_enabled of inactive events
+    - sched: Always inline is_percpu_thread()
+    - io_uring: kill fasync
+    - [armhf] hwmon: (pmbus/ibm-cffps) max_power_out swap changes
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.14
+    - ALSA: usb-audio: Add quirk for VF0770
+    - ALSA: pcm: Workaround for a wrong offset in SYNC_PTR compat ioctl
+    - ALSA: usb-audio: Fix a missing error check in scarlett gen2 mixer
+    - ALSA: seq: Fix a potential UAF by wrong private_free call order
+    - ALSA: hda/realtek: Enable 4-speaker output for Dell Precision 5560 laptop
+    - ALSA: hda - Enable headphone mic on Dell Latitude laptops with ALC3254
+    - ALSA: hda/realtek: Complete partial device name to avoid ambiguity
+    - ALSA: hda/realtek: Add quirk for Clevo X170KM-G
+    - ALSA: hda/realtek - ALC236 headset MIC recording issue
+    - ALSA: hda/realtek: Add quirk for TongFang PHxTxX1
+    - ALSA: hda/realtek: Fix for quirk to enable speaker output on the Lenovo
+      13s Gen2
+    - ALSA: hda/realtek: Fix the mic type detection issue for ASUS G551JW
+    - [amd64] platform/x86: amd-pmc: Add alternative acpi id for PMC controller
+    - dm: fix mempool NULL pointer race when completing IO
+    - [x86] ACPI: PM: Include alternate AMDI0005 id in special behaviour
+    - dm rq: don't queue request to blk-mq during DM suspend
+    - [s390x] fix strrchr() implementation
+    - drm/fbdev: Clamp fbdev surface size if too large
+    - [arm64] hugetlb: fix CMA gigantic page order for non-4K PAGE_SIZE
+    - drm/nouveau/fifo: Reinstate the correct engine bit programming
+    - [arm64] drm/msm: Do not run snapshot on non-DPU devices
+    - [arm64] drm/msm: Avoid potential overflow in timeout_to_jiffies()
+    - btrfs: unlock newly allocated extent buffer after error
+    - btrfs: deal with errors when replaying dir entry during log replay
+    - btrfs: deal with errors when adding inode reference during log replay
+    - btrfs: check for error when looking up inode during dir entry replay
+    - btrfs: update refs for any root except tree log roots
+    - btrfs: fix abort logic in btrfs_replace_file_extents
+    - [x86] resctrl: Free the ctrlval arrays when domain_setup_mon_state() fails
+    - [x86] mei: me: add Ice Lake-N device id.
+    - [x86] mei: hbm: drop hbm responses on early shutdown
+    - xhci: guard accesses to ep_state in xhci_endpoint_reset()
+    - xhci: add quirk for host controllers that don't update endpoint DCS
+    - xhci: Fix command ring pointer corruption while aborting a command
+    - xhci: Enable trust tx length quirk for Fresco FL11 USB controller
+    - cb710: avoid NULL pointer subtraction
+    - [arm64,x86] efi/cper: use stack buffer for error record decoding
+    - efi: Change down_interruptible() in virt_efi_reset_system() to
+      down_trylock()
+    - [armhf] usb: musb: dsps: Fix the probe error path
+    - Input: xpad - add support for another USB ID of Nacon GC-100
+    - USB: serial: qcserial: add EM9191 QDL support
+    - USB: serial: option: add Quectel EC200S-CN module support
+    - USB: serial: option: add Telit LE910Cx composition 0x1204
+    - USB: serial: option: add prod. id for Quectel EG91
+    - virtio: write back F_VERSION_1 before validate
+    - nvmem: Fix shift-out-of-bound (UBSAN) with byte size cells
+    - virtio-blk: remove unneeded "likely" statements
+    - Revert "virtio-blk: Add validation for block size in config space"
+    - [x86] fpu: Mask out the invalid MXCSR bits properly
+    - [x86] Kconfig: Do not enable AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT
+      automatically
+    - [powerpc*] xive: Discard disabled interrupts in get_irqchip_state()
+    - [armhf] drivers: bus: simple-pm-bus: Add support for probing simple bus
+      only devices
+    - driver core: Reject pointless SYNC_STATE_ONLY device links
+    - iio: adc: ad7192: Add IRQ flag
+    - iio: adc: ad7780: Fix IRQ flag
+    - iio: adc: ad7793: Fix IRQ flag
+    - iio: adis16480: fix devices that do not support sleep mode
+    - iio: adc128s052: Fix the error handling path of 'adc128_probe()'
+    - iio: adc: max1027: Fix wrong shift with 12-bit devices
+    - iio: adis16475: fix deadlock on frequency set
+    - iio: light: opt3001: Fixed timeout error when 0 lux
+    - iio: adc: max1027: Fix the number of max1X31 channels
+    - eeprom: at25: Add SPI ID table
+    - iio: dac: ti-dac5571: fix an error code in probe()
+    - [arm64] tee: optee: Fix missing devices unregister during optee_remove
+    - [armel,armhf] dts: bcm2711-rpi-4-b: Fix usb's unit address
+    - [armel,armhf] dts: bcm2711-rpi-4-b: fix sd_io_1v8_reg regulator states
+    - [armel,armhf] dts: bcm2711-rpi-4-b: Fix pcie0's unit address formatting
+    - nvme-pci: Fix abort command id
+    - sctp: account stream padding length for reconf chunk
+    - [arm64,armhf] gpio: pca953x: Improve bias setting
+    - net/smc: improved fix wait on already cleared link
+    - net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path
+    - net/mlx5e: Mutually exclude RX-FCS and RX-port-timestamp
+    - net/mlx5e: Switchdev representors are not vlan challenged
+    - net: stmmac: fix get_hw_feature() on old hardware
+    - net: phy: Do not shutdown PHYs in READY state
+    - [arm64,armhf] net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's
+    - [arm64,armhf] net: dsa: fix spurious error message when unoffloaded port
+      leaves bridge
+    - ethernet: s2io: fix setting mac address during resume
+    - nfc: fix error handling of nfc_proto_register()
+    - NFC: digital: fix possible memory leak in digital_tg_listen_mdaa()
+    - NFC: digital: fix possible memory leak in digital_in_send_sdd_req()
+    - pata_legacy: fix a couple uninitialized variable bugs
+    - ata: ahci_platform: fix null-ptr-deref in
+      ahci_platform_enable_regulators()
+    - spi: spidev: Add SPI ID table
+    - drm/edid: In connector_bad_edid() cap num_of_ext by num_blocks read
+    - [arm64] drm/msm: Fix null pointer dereference on pointer edp
+    - [arm64] drm/msm/mdp5: fix cursor-related warnings
+    - [arm64] drm/msm/submit: fix overflow check on 64-bit architectures
+    - [arm64] drm/msm/a6xx: Track current ctx by seqno
+    - [arm64] drm/msm/a4xx: fix error handling in a4xx_gpu_init()
+    - [arm64] drm/msm/a3xx: fix error handling in a3xx_gpu_init()
+    - [arm64] drm/msm/dsi: dsi_phy_14nm: Take ready-bit into account in
+      poll_for_ready
+    - [arm64] drm/msm/dsi: Fix an error code in msm_dsi_modeset_init()
+    - [arm64] drm/msm/dsi: fix off by one in dsi_bus_clk_enable error handling
+    - [arm64] acpi/arm64: fix next_platform_timer() section mismatch error
+    - [x86] platform/x86: intel_scu_ipc: Fix busy loop expiry time
+    - mqprio: Correct stats in mqprio_dump_class_stats().
+    - mptcp: fix possible stall on recvmsg()
+    - qed: Fix missing error code in qed_slowpath_start()
+    - ice: fix locking for Tx timestamp tracking flush
+    - nfp: flow_offload: move flow_indr_dev_register from app init to app start
+    - [arm64] net: mscc: ocelot: make use of all 63 PTP timestamp identifiers
+    - [arm64] net: mscc: ocelot: avoid overflowing the PTP timestamp FIFO
+    - [arm64] net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown
+      skb
+    - [arm64] net: mscc: ocelot: deny TX timestamping of non-PTP packets
+    - [arm64] net: mscc: ocelot: cross-check the sequence id from the timestamp
+      FIFO with the skb PTP header
+    - [arm64] net: dsa: felix: break at first CPU port during init and teardown
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.15
+    - [armhf] dts: vexpress-v2p-ca9: Fix the SMB unit-address
+    - block: decode QUEUE_FLAG_HCTX_ACTIVE in debugfs output
+    - [x86] xen/x86: prevent PVH type from getting clobbered
+    - r8152: avoid to resubmit rx immediately
+    - drm/amdgpu: init iommu after amdkfd device init
+    - NFSD: Keep existing listeners on portlist error
+    - [powerpc*] powerpc/lib: Add helper to check if offset is within
+      conditional branch range
+    - [powerpc*] powerpc/bpf: Validate branch ranges
+    - [powerpc*] powerpc/security: Add a helper to query stf_barrier type
+    - [powerpc*] powerpc/bpf: Emit stf barrier instruction sequences for
+      BPF_NOSPEC
+    - [arm64] KVM: arm64: Fix host stage-2 PGD refcount
+    - [arm64] KVM: arm64: Release mmap_lock when using VM_SHARED with MTE
+    - netfilter: xt_IDLETIMER: fix panic that occurs when timer_type has garbage
+      value
+    - netfilter: nf_tables: skip netdev events generated on netns removal
+    - ice: Fix failure to re-add LAN/RDMA Tx queues
+    - ice: Avoid crash from unnecessary IDA free
+    - ice: fix getting UDP tunnel entry
+    - ice: Print the api_patch as part of the fw.mgmt.api
+    - netfilter: ip6t_rt: fix rt0_hdr parsing in rt_mt6
+    - netfilter: ipvs: make global sysctl readonly in non-init netns
+    - sctp: fix transport encap_port update in sctp_vtag_verify
+    - tcp: md5: Fix overlap between vrf and non-vrf keys
+    - ipv6: When forwarding count rx stats on the orig netdev
+    - hamradio: baycom_epp: fix build for UML
+    - net/sched: act_ct: Fix byte count on fragmented packets
+    - [arm64,armhf] net: dsa: Fix an error handling path in
+      'dsa_switch_parse_ports_of()'
+    - [powerpc*] smp: do not decrement idle task preempt count in CPU offline
+    - [arm64] net: hns3: Add configuration of TM QCN error event
+    - [arm64] net: hns3: reset DWRR of unused tc to zero
+    - [arm64] net: hns3: add limit ets dwrr bandwidth cannot be 0
+    - [arm64] net: hns3: schedule the polling again when allocation fails
+    - [arm64] net: hns3: fix vf reset workqueue cannot exit
+    - [arm64] net: hns3: disable sriov before unload hclge layer
+    - net: stmmac: Fix E2E delay mechanism
+    - ptp: Fix possible memory leak in ptp_clock_register()
+    - e1000e: Fix packet loss on Tiger Lake and later
+    - igc: Update I226_K device ID
+    - ice: Add missing E810 device ids
+    - net/mlx5e: IPsec: Fix a misuse of the software parser's fields
+    - net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flags
+    - [arm64] net: enetc: fix ethtool counter name for PM0_TERR
+    - [arm64] net: enetc: make sure all traffic classes can send large frames
+    - can: peak_usb: pcan_usb_fd_decode_status(): fix back to ERROR_ACTIVE state
+      notification
+    - can: peak_pci: peak_pci_remove(): fix UAF
+    - can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path
+    - can: isotp: isotp_sendmsg(): add result check for
+      wait_event_interruptible()
+    - can: isotp: isotp_sendmsg(): fix TX buffer concurrent access in
+      isotp_sendmsg()
+    - can: j1939: j1939_tp_rxtimer(): fix errant alert in j1939_tp_rxtimer
+    - can: j1939: j1939_netdev_start(): fix UAF for rx_kref of j1939_priv
+    - can: j1939: j1939_xtp_rx_dat_one(): cancel session if receive TP.DT with
+      error length
+    - can: j1939: j1939_xtp_rx_rts_session_new(): abort TP less than 9 bytes
+    - ceph: skip existing superblocks that are blocklisted or shut down when
+      mounting
+    - ceph: fix handling of "meta" errors
+    - tracing: Have all levels of checks prevent recursion
+    - ocfs2: fix data corruption after conversion from inline format
+    - ocfs2: mount fails with buffer overflow in strlen
+    - userfaultfd: fix a race between writeprotect and exit_mmap()
+    - mm/mempolicy: do not allow illegal MPOL_F_NUMA_BALANCING | MPOL_LOCAL in
+      mbind()
+    - vfs: check fd has read access in kernel_read_file_from_fd()
+    - ALSA: usb-audio: Provide quirk for Sennheiser GSP670 Headset
+    - ALSA: hda/realtek: Add quirk for Clevo PC50HS
+    - ASoC: DAPM: Fix missing kctl change notifications
+    - [x86] ASoC: nau8824: Fix headphone vs headset, button-press detection no
+      longer working
+    - blk-cgroup: blk_cgroup_bio_start() should use irq-safe operations on
+      blkg->iostat_cpu
+    - audit: fix possible null-pointer dereference in audit_filter_rules
+    - ucounts: Move get_ucounts from cred_alloc_blank to
+      key_change_session_keyring
+    - ucounts: Pair inc_rlimit_ucounts with dec_rlimit_ucoutns in commit_creds
+    - ucounts: Proper error handling in set_cred_ucounts
+    - ucounts: Fix signal ucount refcounting
+    - [powerpc*] KVM: PPC: Book3S HV: Fix stack handling in
+      idle_kvm_start_guest()
+    - [powerpc*] KVM: PPC: Book3S HV: Make idle_kvm_start_guest() return 0 if it
+      went to guest (CVE-2021-43056)
+    - [powerpc*] idle: Don't corrupt back chain when going idle
+    - mm, slub: fix mismatch between reconstructed freelist depth and cnt
+    - mm, slub: fix potential memoryleak in kmem_cache_open()
+    - mm, slub: fix potential use-after-free in slab_debugfs_fops
+    - mm, slub: fix incorrect memcg slab count for bulk free
+    - [x86] KVM: nVMX: promptly process interrupts delivered while in guest mode
+    - [x86] KVM: SEV: Flush cache on non-coherent systems before
+      RECEIVE_UPDATE_DATA
+    - [x86] KVM: SEV-ES: rename guest_ins_data to sev_pio_data
+    - [x86] KVM: SEV-ES: clean up kvm_sev_es_ins/outs
+    - [x86] KVM: SEV-ES: keep INS functions together
+    - [x86] KVM: SEV-ES: fix length of string I/O
+    - [x86] KVM: SEV-ES: go over the sev_pio_data buffer in multiple passes if
+      needed
+    - [x86] KVM: SEV-ES: reduce ghcb_sa_len to 32 bits
+    - [x86] KVM: x86: leave vcpu->arch.pio.count alone in emulator_pio_in_out
+    - [x86] KVM: x86: check for interrupts before deciding whether to exit the
+      fast path
+    - [x86] KVM: x86: split the two parts of emulator_pio_in
+    - [x86] KVM: x86: remove unnecessary arguments from complete_emulator_pio_in
+    - nfc: nci: fix the UAF of rf_conn_info object (CVE-2021-3760)
+    - isdn: cpai: check ctr->cnr to avoid array index out of bound
+      (CVE-2021-3896)
+    - [sh4] net: bridge: mcast: use multicast_membership_interval for IGMPv3
+    - [x86] KVM: SEV-ES: Set guest_state_protected after VMSA update
+    - [arm64] net: hns3: fix the max tx size according to user manual
+    - [x86] KVM: MMU: Reset mmu->pkru_mask to avoid stale data
+    - [arm64] drm/msm/a6xx: Serialize GMU communication
+    - ALSA: hda: intel: Allow repeatedly probing on codec configuration errors
+    - btrfs: deal with errors when checking if a dir entry exists during log
+      replay
+    - net: stmmac: add support for dwmac 3.40a
+    - [x86] platform/x86: intel_scu_ipc: Increase virtual timeout to 10s
+    - [x86] platform/x86: intel_scu_ipc: Update timeout value in comment
+    - ALSA: hda: avoid write to STATESTS if controller is in reset
+    - spi: Fix deadlock when adding SPI controllers on SPI buses
+    - spi-mux: Fix false-positive lockdep splats
+    - [x86] perf/x86/msr: Add Sapphire Rapids CPU support
+    - scsi: iscsi: Fix set_param() handling
+    - [x86] scsi: storvsc: Fix validation for unsolicited incoming packets
+    - scsi: qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()
+    - mm/thp: decrease nr_thps in file's mapping on THP split
+    - sched/scs: Reset the shadow stack when idle_task_exit
+    - [arm64] net: hns3: fix
for miscalculation of rx unused desc + - net/mlx5: Lag, move lag destruction to a workqueue + - net/mlx5: Lag, change multipath and bonding to be mutually exclusive + - autofs: fix wait name hash calculation in autofs_wait() + - scsi: core: Fix shost->cmd_per_lun calculation in scsi_add_host_with_dma() + - [s390x] pci: cleanup resources only if necessary + - [s390x] pci: fix zpci_zdev_put() on reserve + - net: mdiobus: Fix memory leak in __mdiobus_register + - e1000e: Separate TGP board type from SPT + - [armhf] pinctrl: stm32: use valid pin identifier in stm32_pinctrl_resume() + https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.16 + - [armel,armhf] 9134/1: remove duplicate memcpy() definition + - [armel,armhf] 9139/1: kprobes: fix arch_init_kprobes() prototype + - [armel,armhf] 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in + arch/arm/kernel/head.S + - usbnet: sanity check for maxpacket + - usbnet: fix error return code in usbnet_probe() + - pinctrl: amd: disable and mask interrupts on probe + - ata: sata_mv: Fix the error handling of mv_chip_id() + - tipc: fix size validations for the MSG_CRYPTO type (CVE-2021-43267) + - nfc: port100: fix using -ERRNO as command type mask + - Revert "net: mdiobus: Fix memory leak in __mdiobus_register" + - mmc: vub300: fix control-message timeouts + - mmc: cqhci: clear HALT state after CQE enable + - [armhf] mmc: dw_mmc: exynos: fix the finding clock sample value + - mmc: sdhci: Map more voltage level to SDHCI_POWER_330 + - mmc: sdhci-pci: Read card detect from ACPI for Intel Merrifield + - [arm64,armhf] mmc: sdhci-esdhc-imx: clear the buffer_read_ready to reset + standard tuning circuit + - block: Fix partition check for host-aware zoned block devices + - ocfs2: fix race between searching chunks and release journal_head from + buffer_head + - nvme-tcp: fix H2CData PDU send accounting (again) + - cfg80211: scan: fix RCU in cfg80211_add_nontrans_list() + - cfg80211: fix management registrations locking + - net: lan78xx: fix 
division by zero in send path + - mm: hwpoison: remove the unnecessary THP check + - mm: filemap: check if THP has hwpoisoned subpage for PMD page fault + - mm, thp: bail out early in collapse_file for writeback page + - mm: khugepaged: skip huge page collapse for special files + - [arm64] dts: imx8mm-kontron: Fix polarity of reg_rst_eth2 + - [arm64] dts: imx8mm-kontron: Fix CAN SPI clock frequency + - [arm64] dts: imx8mm-kontron: Fix connection type for VSC8531 RGMII PHY + - [arm64] dts: imx8mm-kontron: Set lower limit of VDD_SNVS to 800 mV + - [arm64] dts: imx8mm-kontron: Make sure SOC and DRAM supply voltages are + correct + - mac80211: mesh: fix HE operation element length check + - drm/ttm: fix memleak in ttm_transfered_destroy + - [x86] drm/i915: Convert unconditional clflush to drm_clflush_virt_range() + - [x86] drm/i915: Catch yet another unconditioal clflush + - [x86] drm/i915/dp: Skip the HW readout of DPCD on disabled encoders + - drm/amdgpu: Fix even more out of bound writes from debugfs + - drm/amdgpu: fix out of bounds write (CVE-2021-42327) + - drm/amdgpu: support B0&B1 external revision id for yellow carp + - drm/amd/display: Limit display scaling to up to true 4k for DCN 3.1 + - drm/amd/display: Fix prefetch bandwidth calculation for DCN3.1 + - drm/amd/display: increase Z9 latency to workaround underflow in Z9 + - drm/amd/display: Increase watermark latencies for DCN3.1 + - drm/amd/display: Moved dccg init to after bios golden init + - drm/amd/display: Fallback to clocks which meet requested voltage on DCN31 + - drm/amd/display: Fix deadlock when falling back to v2 from v3 + - Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout" + - cgroup: Fix memory leak caused by missing cgroup_bpf_offline + - [riscv64] riscv, bpf: Fix potential NULL dereference + - tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function + - bpf: Fix potential race in tail call compatibility check + - bpf: Fix error usage of map_fd and fdget() 
in generic_map_update_batch() + - [amd64] IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt + fields + - [amd64] IB/hfi1: Fix abba locking issue with sc_disable() + - nvmet-tcp: fix data digest pointer calculation + - nvme-tcp: fix data digest pointer calculation + - nvme-tcp: fix possible req->offset corruption + - ice: Respond to a NETDEV_UNREGISTER event for LAG + - RDMA/mlx5: Set user priority for DCT + - ice: check whether PTP is initialized in ice_ptp_release() + - [arm64] dts: allwinner: h5: NanoPI Neo 2: Fix ethernet node + - regmap: Fix possible double-free in regcache_rbtree_exit() + - net: batman-adv: fix error handling + - net-sysfs: initialize uid and gid before calling net_ns_get_ownership + - cfg80211: correct bridge/4addr mode check + - net: Prevent infinite while loop in skb_tx_hash() + - RDMA/mlx5: Initialize the ODP xarray when creating an ODP MR + - RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string + - net: ethernet: microchip: lan743x: Fix driver crash when lan743x_pm_resume + fails + - net: ethernet: microchip: lan743x: Fix dma allocation failure by using + dma_set_mask_and_coherent + - [arm64] net: hns3: fix pause config problem after autoneg disabled + - [arm64] net: hns3: fix data endian problem of some functions of debugfs + - net: ethernet: microchip: lan743x: Fix skb allocation failure + - phy: phy_ethtool_ksettings_get: Lock the phy for consistency + - phy: phy_ethtool_ksettings_set: Move after phy_start_aneg + - phy: phy_start_aneg: Add an unlocked version + - phy: phy_ethtool_ksettings_set: Lock the PHY while changing settings + - sctp: use init_tag from inithdr for ABORT chunk (CVE-2021-3772) + - sctp: fix the processing for INIT chunk (CVE-2021-3772) + - sctp: fix the processing for INIT_ACK chunk (CVE-2021-3772) + - sctp: fix the processing for COOKIE_ECHO chunk (CVE-2021-3772) + - sctp: add vtag check in sctp_sf_violation (CVE-2021-3772) + - sctp: add vtag check in sctp_sf_do_8_5_1_E_sa 
(CVE-2021-3772) + - sctp: add vtag check in sctp_sf_ootb (CVE-2021-3772) + - bpf: Use kvmalloc for map values in syscall + - [arm64] watchdog: sbsa: only use 32-bit accessors + - bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of + CONFIG_NET + - [arm64] net: hns3: add more string spaces for dumping packets number of + queue info in debugfs + - [arm64] net: hns3: expand buffer len for some debugfs command + - virtio-ring: fix DMA metadata flags + - [s390x] KVM: s390: clear kicked_mask before sleeping again + - [s390x] KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu + - [powerpc*] scsi: ibmvfc: Fix up duplicate response detection + - [riscv64] fix misalgned trap vector base address + - [x86] KVM: switch pvclock_gtod_sync_lock to a raw spinlock + - [x86] KVM: SEV-ES: fix another issue with string I/O VMGEXITs + - [x86] KVM: Take srcu lock in post_kvm_run_save() + + [ Salvatore Bonaccorso ] + * Revert "[amd64] Unset AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT" + * Bump ABI to 4 + * media: ir-kbd-i2c: improve responsiveness of hauppauge zilog receivers + (Closes: #994050) + * [x86] media: ite-cir: IR receiver stop working after receive overflow + (Closes: #996672) + * scsi: core: Put LLD module refcnt after SCSI device is released + * sfc: Fix reading non-legacy supported link modes + * vrf: Revert "Reset skb conntrack connection..." 
+  * media: firewire: firedtv-avc: fix a buffer overflow in avc_ca_pmt() (CVE-2021-42739)
+
+ -- Salvatore Bonaccorso <carnil@debian.org>  Wed, 03 Nov 2021 15:35:31 +0100
+
+linux (5.14.12-1) unstable; urgency=medium
+
+  * New upstream stable update:
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.10
+    - [arm64,armhf] media: cedrus: Fix SUNXI tile size calculation
+    - [arm64] ASoC: fsl_sai: register platform component before registering cpu dai
+    - [armhf] ASoC: fsl_spdif: register platform component before registering cpu dai
+    - [x86] ASoC: SOF: Fix DSP oops stack dump output contents
+    - [arm64] pinctrl: qcom: spmi-gpio: correct parent irqspec translation
+    - net/mlx4_en: Resolve bad operstate value
+    - [s390x] qeth: Fix deadlock in remove_discipline
+    - [s390x] qeth: fix deadlock during failing recovery
+    - [x86] crypto: ccp - fix resource leaks in ccp_run_aes_gcm_cmd() (CVE-2021-3744, CVE-2021-3764)
+    - [m68k] Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
+    - [amd64] HID: amd_sfh: Fix potential NULL pointer dereference
+    - tty: Fix out-of-bound vmalloc access in imageblit
+    - cpufreq: schedutil: Use kobject release() method to free sugov_tunables
+    - scsi: qla2xxx: Changes to support kdump kernel for NVMe BFS
+    - drm/amdgpu: adjust fence driver enable sequence
+    - drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)
+    - drm/amdgpu: stop scheduler when calling hw_fini (v2)
+    - cpufreq: schedutil: Destroy mutex before kobject_put() frees the memory
+    - scsi: ufs: ufs-pci: Fix Intel LKF link stability
+    - ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSION
+    - ALSA: firewire-motu: fix truncated bytes in message tracepoints
+    - ALSA: hda/realtek: Quirks to enable speaker output for Lenovo Legion 7i 15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops.
+    - [amd64,arm64] ACPI: NFIT: Use fallback node id when numa info in NFIT table is incorrect
+    - fs-verity: fix signed integer overflow with i_size near S64_MAX
+    - hwmon: (tmp421) handle I2C errors
+    - hwmon: (w83793) Fix NULL pointer dereference by removing unnecessary structure field
+    - hwmon: (w83792d) Fix NULL pointer dereference by removing unnecessary structure field
+    - hwmon: (w83791d) Fix NULL pointer dereference by removing unnecessary structure field
+    - [arm64,armhf] gpio: pca953x: do not ignore i2c errors
+    - scsi: ufs: Fix illegal offset in UPIU event trace
+    - mac80211: fix use-after-free in CCMP/GCMP RX
+    - [x86] platform/x86/intel: hid: Add DMI switches allow list
+    - [x86] kvmclock: Move this_cpu_pvti into kvmclock.h
+    - [x86] ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm
+    - [x86] KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()
+    - [x86] KVM: x86: nSVM: don't copy virt_ext from vmcb12
+    - [x86] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
+    - [x86] KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks
+    - [x86] KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
+    - [x86] KVM: SEV: Update svm_vm_copy_asid_from for SEV-ES
+    - [x86] KVM: SEV: Pin guest memory for write for RECEIVE_UPDATE_DATA
+    - [x86] KVM: SEV: Acquire vcpu mutex when updating VMSA
+    - [x86] KVM: SEV: Allow some commands for mirror VM
+    - [x86] KVM: SVM: fix missing sev_decommission in sev_receive_start
+    - [x86] KVM: nVMX: Fix nested bus lock VM exit
+    - [x86] KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
+    - RDMA/cma: Do not change route.addr.src_addr.ss_family
+    - RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
+    - nbd: use shifts rather than multiplies
+    - drm/amd/display: initialize backlight_ramping_override to false
+    - drm/amd/display: Pass PCI deviceid into DC
+    - drm/amd/display: Fix Display Flicker on embedded panels
+    - drm/amdgpu: force exit gfxoff on sdma resume for rmb s0ix
+    - drm/amdgpu: check tiling flags when creating FB on GFX8-
+    - drm/amdgpu: correct initial cp_hqd_quantum for gfx9
+    - [amd64] drm/i915/gvt: fix the usage of ww lock in gvt scheduler.
+    - ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
+    - bpf: Handle return value of BPF_PROG_TYPE_STRUCT_OPS prog
+    - IB/cma: Do not send IGMP leaves for sendonly Multicast groups
+    - RDMA/cma: Fix listener leak in rdma_cma_listen_on_all() failure
+    - netfilter: nf_tables: unlink table before deleting it
+    - netfilter: log: work around missing softdep backend module
+    - Revert "mac80211: do not use low data rates for data frames with no ack flag"
+    - mac80211: Fix ieee80211_amsdu_aggregate frag_tail bug
+    - mac80211: limit injected vht mcs/nss in ieee80211_parse_tx_radiotap
+    - mac80211: mesh: fix potentially unaligned access
+    - mac80211-hwsim: fix late beacon hrtimer handling
+    - driver core: fw_devlink: Add support for FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD
+    - net: mdiobus: Set FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD for mdiobus parents
+    - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
+    - mptcp: don't return sockets in foreign netns
+    - mptcp: allow changing the 'backup' bit when no sockets are open
+    - [arm64] RDMA/hns: Work around broken constant propagation in gcc 8
+    - hwmon: (tmp421) report /PVLD condition as fault
+    - hwmon: (tmp421) fix rounding for negative values
+    - [arm64] net: enetc: fix the incorrect clearing of IF_MODE bits
+    - net: ipv4: Fix rtnexthop len when RTA_FLOW is present
+    - smsc95xx: fix stalled rx after link change
+    - [x86] drm/i915/request: fix early tracepoints
+    - [x86] drm/i915: Remove warning from the rps worker
+    - [arm64,armhf] dsa: mv88e6xxx: 6161: Use chip wide MAX MTU
+    - [arm64,armhf] dsa: mv88e6xxx: Fix MTU definition
+    - [arm64,armhf] dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports
+    - e100: fix length calculation in e100_get_regs_len
+    - e100: fix buffer overrun in e100_get_regs
+    - [amd64] RDMA/hfi1: Fix kernel pointer leak
+    - [arm64] RDMA/hns: Fix the size setting error when copying CQE in clean_cq()
+    - [arm64] RDMA/hns: Add the check of the CQE size of the user space
+    - bpf: Exempt CAP_BPF from checks against bpf_jit_limit
+    - [amd64] bpf, x86: Fix bpf mapping of atomic fetch implementation
+    - Revert "block, bfq: honor already-setup queue merges"
+    - scsi: csiostor: Add module softdep on cxgb4
+    - ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
+    - [arm64] net: hns3: do not allow call hns3_nic_net_open repeatedly
+    - [arm64] net: hns3: remove tc enable checking
+    - [arm64] net: hns3: don't rollback when destroy mqprio fail
+    - [arm64] net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
+    - [arm64] net: hns3: fix show wrong state when add existing uc mac address
+    - [arm64] net: hns3: reconstruct function hns3_self_test
+    - [arm64] net: hns3: fix always enable rx vlan filter problem after selftest
+    - [arm64] net: hns3: disable firmware compatible features when uninstall PF
+    - [arm64,armhf] net: phy: bcm7xxx: Fixed indirect MMD operations
+    - net: sched: flower: protect fl_walk() with rcu
+    - net: stmmac: fix EEE init issue when paired with EEE capable PHYs
+    - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
+    - [x86] perf/x86/intel: Update event constraints for ICX
+    - sched/fair: Add ancestors of unthrottled undecayed cfs_rq
+    - sched/fair: Null terminate buffer when updating tunable_scaling
+    - [armhf] hwmon: (occ) Fix P10 VRM temp sensors
+    - [x86] kvm: fix objtool relocation warning
+    - nvme: add command id quirk for apple controllers
+    - elf: don't use MAP_FIXED_NOREPLACE for elf interpreter mappings
+    - driver core: fw_devlink: Improve handling of cyclic dependencies
+    - debugfs: debugfs_create_file_size(): use IS_ERR to check for error
+    - ext4: fix loff_t overflow in ext4_max_bitmap_size()
+    - ext4: fix reserved space counter leakage
+    - ext4: add error checking to ext4_ext_replay_set_iblocks()
+    - ext4: fix potential infinite loop in ext4_dx_readdir()
+    - ext4: flush s_error_work before journal destroy in ext4_fill_super
+    - HID: u2fzero: ignore incomplete packets without data (Closes: #994535)
+    - net: udp: annotate data race around udp_sk(sk)->corkflag
+    - usb: hso: remove the bailout parameter
+    - HID: betop: fix slab-out-of-bounds Write in betop_probe
+    - netfilter: ipset: Fix oversized kvmalloc() calls
+    - mm: don't allow oversized kvmalloc() calls
+    - HID: usbhid: free raw_report buffers in usbhid_stop
+    - [x86] crypto: aesni - xts_crypt() return if walk.nbytes is 0
+    - [x86] KVM: x86: Handle SRCU initialization failure during page track init
+    - netfilter: conntrack: serialize hash resizes and cleanups
+    - netfilter: nf_tables: Fix oversized kvmalloc() calls
+    - [amd64] HID: amd_sfh: Fix potential NULL pointer dereference - take 2
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.11
+    - [arm64,armhf] spi: rockchip: handle zero length transfers without timing out
+    - afs: Add missing vnode validation checks
+    - nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN
+    - btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
+    - btrfs: fix mount failure due to past and transient device flush error
+    - net: mdio: introduce a shutdown method to mdio device drivers
+    - xen-netback: correct success/error reporting for the SKB-with-fraglist case
+    - [sparc64] fix pci_iounmap() when CONFIG_PCI is not set
+    - scsi: sd: Free scsi_disk device via put_device()
+    - [arm*] usb: dwc2: check return value after calling platform_get_resource()
+    - Xen/gntdev: don't ignore kernel unmapping error
+    - swiotlb-xen: ensure to issue well-formed XENMEM_exchange requests
+    - nvme-fc: update hardware queues before using them
+    - nvme-fc: avoid race between time out and tear down
+    - [arm64] thermal/drivers/tsens: Fix wrong check for tzd in irq handlers
+    - scsi: ses: Retry failed Send/Receive Diagnostic commands
+    - [arm64,armhf] irqchip/gic: Work around broken Renesas integration
+    - smb3: correct smb3 ACL security descriptor
+    - [x86] insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
+    - io_uring: allow conditional reschedule for intensive iterators
+    - block: don't call rq_qos_ops->done_bio if the bio isn't tracked
+    - KVM: do not shrink halt_poll_ns below grow_start
+    - [x86] KVM: x86: reset pdptrs_from_userspace when exiting smm
+    - [x86] kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
+    - [x86] KVM: x86: nSVM: restore int_vector in svm_clear_vintr
+    - [x86] perf/x86: Reset destroy callback on event init failure
+    - libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI for Samsung 860 and 870 SSD.
+    - Revert "brcmfmac: use ISO3166 country code and 0 rev as fallback"
+    - [armhf] Revert "ARM: imx6q: drop of_platform_default_populate() from init_machine"
+    https://www.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.14.12
+    - usb: cdc-wdm: Fix check for WWAN
+    - [arm64,armhf] usb: chipidea: ci_hdrc_imx: Also search for 'phys' phandle
+    - usb: gadget: f_uac2: fixed EP-IN wMaxPacketSize
+    - USB: cdc-acm: fix racy tty buffer accesses
+    - USB: cdc-acm: fix break reporting
+    - usb: typec: tcpm: handle SRC_STARTUP state if cc changes
+    - [x86] usb: typec: tipd: Remove dependency on "connector" child fwnode
+    - drm/amdgpu: During s0ix don't wait to signal GFXOFF
+    - drm/nouveau/kms/tu102-: delay enabling cursor until after assign_windows
+    - drm/nouveau/ga102-: support ttm buffer moves via copy engine
+    - [x86] drm/i915: Fix runtime pm handling in i915_gem_shrink
+    - [x86] drm/i915: Extend the async flip VT-d w/a to skl/bxt
+    - xen/privcmd: fix error handling in mmap-resource processing
+    - [arm64] mmc: meson-gx: do not use memcpy_to/fromio for dram-access-quirk
+    - ovl: fix missing negative dentry check in ovl_rename()
+    - ovl: fix IOCB_DIRECT if underlying fs doesn't support direct IO
+    - nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
+    - nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero
+    - SUNRPC: fix sign error causing rpcsec_gss drops
+    - xen/balloon: fix cancelled balloon action
+    - [armhf] dts: omap3430-sdp: Fix NAND device node
+    - scsi: ufs: core: Fix task management completion
+    - [riscv64] Flush current cpu icache before other cpus
+    - [armhf] bus: ti-sysc: Add break in switch statement in sysc_init_soc()
+    - iwlwifi: mvm: Fix possible NULL dereference
+    - [arm64] soc: qcom: mdt_loader: Drop PT_LOAD check on hash segment
+    - [armhf] dts: imx: Add missing pinctrl-names for panel on M53Menlo
+    - [armhf] dts: imx: Fix USB host power regulator polarity on M53Menlo
+    - [amd64] PCI: hv: Fix sleep while in non-sleep context when removing child devices from the bus
+    - iwlwifi: pcie: add configuration of a Wi-Fi adapter on Dell XPS 15
+    - netfilter: conntrack: fix boot failure with nf_conntrack.enable_hooks=1
+    - netfilter: nf_tables: add position handle in event notification
+    - netfilter: nf_tables: reverse order in rule replacement expansion
+    - [armel,armhf] bpf, arm: Fix register clobbering in div/mod implementation
+    - [armhf] soc: ti: omap-prm: Fix external abort for am335x pruss
+    - bpf: Fix integer overflow in prealloc_elems_and_freelist() (CVE-2021-41864)
+    - net/mlx5e: IPSEC RX, enable checksum complete
+    - net/mlx5e: Keep the value for maximum number of channels in-sync
+    - net/mlx5: E-Switch, Fix double allocation of acl flow counter
+    - net/mlx5: Force round second at 1PPS out start time
+    - net/mlx5: Avoid generating event after PPS out in Real time mode
+    - net/mlx5: Fix length of irq_index in chars
+    - net/mlx5: Fix setting number of EQs of SFs
+    - net/mlx5e: Fix the presented RQ index in PTP stats
+    - phy: mdio: fix memory leak
+    - net_sched: fix NULL deref in fifo_set_limit()
+    - [arm64] net: mscc: ocelot: fix VCAP filters remaining active after being deleted
+    - [arm64,armhf] net: stmmac: dwmac-rk: Fix ethernet on rk3399 based devices
+    - [mips*] Revert "add support for buggy MT7621S core detection"
+    - netfilter: nf_tables: honor NLM_F_CREATE and NLM_F_EXCL in event notification
+    - [i386] ptp_pch: Load module automatically if ID matches
+    - [armhf] dts: imx: change the spi-nor tx
+    - [arm64] dts: imx8: change the spi-nor tx
+    - [armhf] imx6: disable the GIC CPU interface before calling stby-poweroff sequence
+    - [x86] drm/i915/audio: Use BIOS provided value for RKL HDA link
+    - [x86] drm/i915/jsl: Add W/A 1409054076 for JSL
+    - [x86] drm/i915/tc: Fix TypeC port init/resume time sanitization
+    - [x86] drm/i915/bdb: Fix version check
+    - netfs: Fix READ/WRITE confusion when calling iov_iter_xarray()
+    - afs: Fix afs_launder_page() to set correct start file position
+    - net: bridge: use nla_total_size_64bit() in br_get_linkxstats_size()
+    - net: bridge: fix under estimation in br_get_linkxstats_size()
+    - net/sched: sch_taprio: properly cancel timer from taprio_destroy()
+    - net: sfp: Fix typo in state machine debug string
+    - net: pcs: xpcs: fix incorrect CL37 AN sequence
+    - netlink: annotate data races around nlk->bound
+    - drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume
+    - [armhf] bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893
+    - [arm64,armhf] drm/sun4i: dw-hdmi: Fix HDMI PHY clock setup
+    - drm/nouveau: avoid a use-after-free when BO init fails
+    - drm/nouveau/kms/nv50-: fix file release memory leak
+    - drm/nouveau/debugfs: fix file release memory leak
+    - net: pcs: xpcs: fix incorrect steps on disable EEE
+    - net: stmmac: trigger PCS EEE to turn off on link down
+    - [amd64,arm64] gve: Correct available tx qpl check
+    - [amd64,arm64] gve: Avoid freeing NULL pointer
+    - [amd64,arm64] gve: Properly handle errors in gve_assign_qpl
+    - rtnetlink: fix if_nlmsg_stats_size() under estimation
+    - [amd64,arm64] gve: fix gve_get_stats()
+    - [amd64,arm64] gve: report 64bit tx_bytes counter from gve_handle_report_stats()
+    - i40e: fix endless loop under rtnl
+    - i40e: Fix freeing of uninitialized misc IRQ vector
+    - iavf: fix double unlock of crit_lock
+    - net: prefer socket bound to interface when not in VRF
+    - [powerpc*] iommu: Report the correct most efficient DMA mask for PCI devices
+    - i2c: acpi: fix resource leak in reconfiguration device addition
+    - [riscv64] explicitly use symbol offsets for VDSO
+    - [riscv64] vdso: Refactor asm/vdso.h
+    - [riscv64] vdso: Move vdso data page up front
+    - [riscv64] vdso: make arch_setup_additional_pages wait for mmap_sem for write killable
+    - [s390x] bpf, s390: Fix potential memory leak about jit_data
+    - [riscv64] Include clone3() on rv32
+    - scsi: iscsi: Fix iscsi_task use after free
+    - [powerpc*] bpf: Fix BPF_MOD when imm == 1
+    - [powerpc*] bpf: Fix BPF_SUB when imm == 0x80000000
+    - [powerpc*] 64s: fix program check interrupt emergency stack path
+    - [powerpc*] traps: do not enable irqs in _exception
+    - [powerpc*] 64s: Fix unrecoverable MCE calling async handler from NMI
+    - [powerpc*] pseries/eeh: Fix the kdump kernel crash during eeh_pseries_init
+    - [i386] x86/platform/olpc: Correct ifdef symbol to intended CONFIG_OLPC_XO15_SCI
+    - [x86] fpu: Restore the masking out of reserved MXCSR bits
+    - [x86] entry: Correct reference to intended CONFIG_64_BIT
+    - [x86] hpet: Use another crystalball to evaluate HPET usability
+    - [arm64,armhf] dsa: tag_dsa: Fix mask for trunked packets
+
+  [ Ben Hutchings ]
+  * debian/.gitignore: Ignore debian/tests/control again
+  * integrity: Drop "MODSIGN: load blacklist from MOKx" as redundant after 5.13
+  * tools/perf: Fix warning introduced by "tools/perf: pmu-events: Fix reproducibility"
+  * debian/rules.real: Stop invoking obsolete headers_check target
+  * libcpupower: Update symbols file for changes in 5.13.9-1~exp1
+
+  [ John Paul Adrian Glaubitz ]
+  * [alpha] Re-enable CONFIG_EISA which was disabled upstream by accident
+
+  [ Salvatore Bonaccorso ]
+  * Bump ABI to 3
+  * mm/secretmem: Fix NULL page->mapping dereference in page_is_secretmem() (Closes: #996175)
+
+  [ Aurelien Jarno ]
+  * [riscv64] Improve HiFive Unmatched support: enable SENSORS_LM90.
+
+ -- Salvatore Bonaccorso <carnil@debian.org>  Thu, 14 Oct 2021 08:39:01 +0200
+
 linux (5.14.9-2~bpo11+1) bullseye-backports; urgency=medium
 
   * Rebuild for bullseye-backports:
diff --git a/debian/config/alpha/config b/debian/config/alpha/config
index 6ae0a8464..7b649fa78 100644
--- a/debian/config/alpha/config
+++ b/debian/config/alpha/config
@@ -169,6 +169,7 @@ CONFIG_IPMI_POWEROFF=m
 ##
 ## file: drivers/eisa/Kconfig
 ##
+CONFIG_EISA=y
 CONFIG_EISA_PCI_EISA=y
 CONFIG_EISA_VIRTUAL_ROOT=y
 CONFIG_EISA_NAMES=y
@@ -503,7 +504,7 @@ CONFIG_B44=m
 ##
 ## file: drivers/net/ethernet/cirrus/Kconfig
 ##
-CONFIG_CS89x0=m
+CONFIG_CS89x0_ISA=m
 
 ##
 ## file: drivers/net/ethernet/dec/tulip/Kconfig
@@ -951,4 +952,3 @@ CONFIG_SND_YMFPCI=m
 ## file: sound/pci/hda/Kconfig
 ##
 CONFIG_SND_HDA_INTEL=m
-
diff --git a/debian/config/alpha/config.alpha-generic b/debian/config/alpha/config.alpha-generic
index d63bf76e6..50398c7a8 100644
--- a/debian/config/alpha/config.alpha-generic
+++ b/debian/config/alpha/config.alpha-generic
@@ -7,4 +7,3 @@
 ## file: arch/alpha/Kconfig.debug
 ##
 # CONFIG_ALPHA_LEGACY_START_ADDRESS is not set
-
diff --git a/debian/config/alpha/config.alpha-smp b/debian/config/alpha/config.alpha-smp
index 19288e398..7b53986db 100644
--- a/debian/config/alpha/config.alpha-smp
+++ b/debian/config/alpha/config.alpha-smp
@@ -13,4 +13,3 @@ CONFIG_NR_CPUS=64
 ## file: drivers/scsi/Kconfig
 ##
 CONFIG_SCSI=y
-
diff --git a/debian/config/amd64/config b/debian/config/amd64/config
index c1b6f5ea7..ed380d167 100644
--- a/debian/config/amd64/config
+++ b/debian/config/amd64/config
@@ -260,4 +260,3 @@ CONFIG_LSM_MMAP_MIN_ADDR=65536
 CONFIG_SND_SOC_AMD_ACP3x=m
 CONFIG_SND_SOC_AMD_RENOIR=m
 CONFIG_SND_SOC_AMD_RENOIR_MACH=m
-
diff --git a/debian/config/amd64/config.cloud-amd64 b/debian/config/amd64/config.cloud-amd64
index 1e7a408a4..da78f3d31 100644
--- a/debian/config/amd64/config.cloud-amd64
+++ b/debian/config/amd64/config.cloud-amd64
@@ -43,4 +43,3 @@
 ## file: drivers/watchdog/Kconfig
 ##
 # CONFIG_PCIPCWATCHDOG is not set
-
diff --git a/debian/config/amd64/defines b/debian/config/amd64/defines
index b278f3622..975efaa90 100644
--- a/debian/config/amd64/defines
+++ b/debian/config/amd64/defines
@@ -15,7 +15,7 @@ install-stem: vmlinuz
 breaks: xserver-xorg-input-vmmouse (<< 1:13.0.99)
 
 [relations]
-headers%gcc-10: linux-compiler-gcc-10-x86
+headers%gcc-11: linux-compiler-gcc-11-x86
 
 [amd64_description]
 hardware: 64-bit PCs
diff --git a/debian/config/arm64/config b/debian/config/arm64/config
index 541f7664e..b37ee9fa1 100644
--- a/debian/config/arm64/config
+++ b/debian/config/arm64/config
@@ -225,6 +225,7 @@ CONFIG_QORIQ_CPUFREQ=m
 ##
 CONFIG_ACPI_CPPC_CPUFREQ=m
 CONFIG_ARM_ARMADA_37XX_CPUFREQ=m
+CONFIG_ARM_IMX_CPUFREQ_DT=m
 CONFIG_ARM_RASPBERRYPI_CPUFREQ=m
 
 ##
@@ -269,6 +270,8 @@ CONFIG_CRYPTO_DEV_MARVELL_CESA=m
 ##
 ## file: drivers/devfreq/Kconfig
 ##
+CONFIG_ARM_IMX_BUS_DEVFREQ=m
+CONFIG_ARM_IMX8M_DDRC_DEVFREQ=m
 CONFIG_ARM_RK3399_DMC_DEVFREQ=m
 
 ##
@@ -285,6 +288,7 @@ CONFIG_DMA_BCM2835=y
 CONFIG_DMA_SUN6I=m
 CONFIG_FSL_EDMA=m
 CONFIG_FSL_QDMA=m
+CONFIG_IMX_SDMA=m
 CONFIG_K3_DMA=m
 CONFIG_MV_XOR=y
 CONFIG_MV_XOR_V2=y
@@ -312,6 +316,7 @@ CONFIG_EDAC_XGENE=m
 ## file: drivers/extcon/Kconfig
 ##
 CONFIG_EXTCON=m
+CONFIG_EXTCON_PTN5150=m
 CONFIG_EXTCON_QCOM_SPMI_MISC=m
 CONFIG_EXTCON_USB_GPIO=m
 CONFIG_EXTCON_USBC_CROS_EC=m
@@ -362,6 +367,7 @@ CONFIG_DRM_AST=m
 ##
 ## file: drivers/gpu/drm/bridge/Kconfig
 ##
+CONFIG_DRM_NWL_MIPI_DSI=m
 CONFIG_DRM_NXP_PTN3460=m
 
 ##
@@ -381,6 +387,11 @@ CONFIG_DRM_ANALOGIX_ANX6345=m
 CONFIG_DRM_DW_HDMI_CEC=m
 
 ##
+## file: drivers/gpu/drm/etnaviv/Kconfig
+##
+CONFIG_DRM_ETNAVIV=m
+
+##
 ## file: drivers/gpu/drm/hisilicon/hibmc/Kconfig
 ##
 CONFIG_DRM_HISI_HIBMC=m
@@ -409,6 +420,11 @@ CONFIG_DRM_MSM_DSI_28NM_PHY=y
 CONFIG_DRM_MSM_DSI_20NM_PHY=y
 
 ##
+## file: drivers/gpu/drm/mxsfb/Kconfig
+##
+CONFIG_DRM_MXSFB=m
+
+##
 ## file: drivers/gpu/drm/nouveau/Kconfig
 ##
 CONFIG_NOUVEAU_PLATFORM_DRIVER=y
@@ -418,6 +434,7 @@ CONFIG_NOUVEAU_PLATFORM_DRIVER=y
 ##
 CONFIG_DRM_PANEL_SIMPLE=m
 CONFIG_DRM_PANEL_RASPBERRYPI_TOUCHSCREEN=m
+CONFIG_DRM_PANEL_SITRONIX_ST7703=m
 
 ##
 ## file: drivers/gpu/drm/panfrost/Kconfig
@@ -453,6 +470,11 @@ CONFIG_DRM_VC4=m
 CONFIG_DRM_VC4_HDMI_CEC=y
 
 ##
+## file: drivers/gpu/drm/vmwgfx/Kconfig
+##
+CONFIG_DRM_VMWGFX=m
+
+##
 ## file: drivers/gpu/host1x/Kconfig
 ##
 CONFIG_TEGRA_HOST1X=m
@@ -467,6 +489,7 @@ CONFIG_I2C_HID_OF=m
 ## file: drivers/hwmon/Kconfig
 ##
 CONFIG_SENSORS_ARM_SCPI=m
+CONFIG_SENSORS_GPIO_FAN=m
 CONFIG_SENSORS_LM75=m
 CONFIG_SENSORS_LM90=m
 CONFIG_SENSORS_PWM_FAN=m
@@ -531,6 +554,7 @@ CONFIG_INFINIBAND_HNS_HIP08=y
 ##
 CONFIG_KEYBOARD_ADC=m
 CONFIG_KEYBOARD_GPIO=m
+CONFIG_KEYBOARD_SNVS_PWRKEY=m
 CONFIG_KEYBOARD_TEGRA=m
 CONFIG_KEYBOARD_CROS_EC=m
@@ -540,6 +564,7 @@ CONFIG_KEYBOARD_CROS_EC=m
 CONFIG_INPUT_MISC=y
 CONFIG_INPUT_PM8941_PWRKEY=m
 CONFIG_INPUT_GPIO_BEEPER=m
+CONFIG_INPUT_GPIO_VIBRA=m
 CONFIG_INPUT_AXP20X_PEK=m
 CONFIG_INPUT_UINPUT=m
 CONFIG_INPUT_RK805_PWRKEY=m
@@ -555,6 +580,8 @@ CONFIG_MOUSE_ELAN_I2C=m
 ##
 CONFIG_INPUT_TOUCHSCREEN=y
 CONFIG_TOUCHSCREEN_ELAN=m
+CONFIG_TOUCHSCREEN_EDT_FT5X06=m
+CONFIG_TOUCHSCREEN_RASPBERRYPI_FW=m
 
 ##
 ## file: drivers/iommu/Kconfig
@@ -578,6 +605,11 @@ CONFIG_LEDS_GPIO=m
 CONFIG_LEDS_PWM=m
 
 ##
+## file: drivers/leds/trigger/Kconfig
+##
+CONFIG_LEDS_TRIGGER_HEARTBEAT=y
+
+##
 ## file: drivers/mailbox/Kconfig
 ##
 CONFIG_MAILBOX=y
@@ -617,6 +649,7 @@ CONFIG_MFD_QCOM_RPM=m
 CONFIG_MFD_SPMI_PMIC=m
 CONFIG_MFD_RK808=y
 CONFIG_MFD_SL28CPLD=y
+CONFIG_MFD_ROHM_BD718XX=m
 
 ##
 ## file: drivers/misc/Kconfig
@@ -658,16 +691,31 @@ CONFIG_MMC_BCM2835=m
 CONFIG_MMC_SDHCI_XENON=m
 
 ##
+## file: drivers/mtd/spi-nor/Kconfig
+##
+CONFIG_MTD_SPI_NOR=m
+
+##
 ## file: drivers/mtd/spi-nor/controllers/Kconfig
 ##
 CONFIG_SPI_HISI_SFC=m
 
 ##
+## file: drivers/net/Kconfig
+##
+CONFIG_VMXNET3=m
+
+##
 ## file: drivers/net/can/Kconfig
 ##
 CONFIG_CAN_FLEXCAN=m
 
 ##
+## file: drivers/net/can/spi/Kconfig
+##
+CONFIG_CAN_MCP251X=m
+
+##
 ## file: drivers/net/dsa/Kconfig
 ##
 CONFIG_NET_DSA_MV88E6060=m
@@ -881,9 +929,15 @@ CONFIG_MESON_GXL_PHY=m
 CONFIG_BCM54140_PHY=m
 CONFIG_MARVELL_PHY=m
 CONFIG_MARVELL_10G_PHY=m
+CONFIG_MICROSEMI_PHY=m
 CONFIG_AT803X_PHY=m
 
 ##
+## file: drivers/net/usb/Kconfig
+##
+CONFIG_USB_NET_SMSC95XX=m
+
+##
 ## file: drivers/net/wireless/ath/wcn36xx/Kconfig
 ##
 CONFIG_WCN36XX=m
@@ -993,6 +1047,7 @@ CONFIG_ARM_CCI5xx_PMU=y
 CONFIG_ARM_CCN=y
 CONFIG_ARM_CMN=m
 CONFIG_ARM_SMMU_V3_PMU=m
+CONFIG_FSL_IMX8_DDR_PMU=m
 CONFIG_QCOM_L2_PMU=y
 CONFIG_QCOM_L3_PMU=y
 CONFIG_XGENE_PMU=y
@@ -1018,6 +1073,12 @@ CONFIG_PHY_SUN4I_USB=m
 CONFIG_PHY_MESON8B_USB2=m
 
 ##
+## file: drivers/phy/freescale/Kconfig
+##
+CONFIG_PHY_FSL_IMX8MQ_USB=m
+CONFIG_PHY_MIXEL_MIPI_DPHY=m
+
+##
 ## file: drivers/phy/hisilicon/Kconfig
 ##
 CONFIG_PHY_HI6220_USB=m
@@ -1111,6 +1172,7 @@ CONFIG_AXP20X_POWER=m
 CONFIG_AXP288_FUEL_GAUGE=m
 CONFIG_CHARGER_GPIO=m
 CONFIG_CHARGER_QCOM_SMBB=m
+CONFIG_CHARGER_BQ25890=m
 CONFIG_CHARGER_CROS_USBPD=m
 
 ##
@@ -1124,6 +1186,7 @@ CONFIG_PTP_1588_CLOCK_QORIQ=m
 CONFIG_PWM=y
 CONFIG_PWM_BCM2835=m
 CONFIG_PWM_CROS_EC=m
+CONFIG_PWM_IMX27=m
 CONFIG_PWM_MESON=m
 CONFIG_PWM_ROCKCHIP=m
 CONFIG_PWM_SL28CPLD=m
@@ -1136,6 +1199,7 @@ CONFIG_PWM_TEGRA=m
 CONFIG_REGULATOR=y
 CONFIG_REGULATOR_FIXED_VOLTAGE=m
 CONFIG_REGULATOR_AXP20X=m
+CONFIG_REGULATOR_BD718XX=m
 CONFIG_REGULATOR_FAN53555=m
 CONFIG_REGULATOR_GPIO=m
 CONFIG_REGULATOR_HI655X=m
@@ -1146,6 +1210,7 @@ CONFIG_REGULATOR_PWM=m
 CONFIG_REGULATOR_QCOM_RPM=m
 CONFIG_REGULATOR_QCOM_SMD_RPM=m
 CONFIG_REGULATOR_QCOM_SPMI=m
+CONFIG_REGULATOR_RASPBERRYPI_TOUCHSCREEN_ATTINY=m
 CONFIG_REGULATOR_RK808=m
 CONFIG_REGULATOR_VCTRL=m
@@ -1160,6 +1225,7 @@ CONFIG_QCOM_Q6V5_MSS=m
 ## file: drivers/reset/Kconfig
 ##
 CONFIG_RESET_CONTROLLER=y
+CONFIG_RESET_IMX7=m
 
 ##
 ## file: drivers/rpmsg/Kconfig
@@ -1176,6 +1242,7 @@ CONFIG_RTC_DRV_MAX77686=y
 CONFIG_RTC_DRV_RK808=y
 CONFIG_RTC_DRV_PCF85063=y
 CONFIG_RTC_DRV_PCF8563=y
+CONFIG_RTC_DRV_M41T80=m
 CONFIG_RTC_DRV_RV8803=m
 CONFIG_RTC_DRV_PCF2127=m
 CONFIG_RTC_DRV_EFI=y
@@ -1186,6 +1253,7 @@ CONFIG_RTC_DRV_MV=m
CONFIG_RTC_DRV_ARMADA38X=m CONFIG_RTC_DRV_PM8XXX=m CONFIG_RTC_DRV_TEGRA=y +CONFIG_RTC_DRV_SNVS=m CONFIG_RTC_DRV_XGENE=y ## @@ -1238,7 +1306,9 @@ CONFIG_ARCH_TEGRA_210_SOC=y CONFIG_SPI_ARMADA_3700=m CONFIG_SPI_BCM2835=m CONFIG_SPI_BCM2835AUX=m +CONFIG_SPI_FSL_QUADSPI=m CONFIG_SPI_NXP_FLEXSPI=m +CONFIG_SPI_IMX=m CONFIG_SPI_FSL_DSPI=m CONFIG_SPI_MESON_SPICC=m CONFIG_SPI_MESON_SPIFC=m @@ -1262,6 +1332,16 @@ CONFIG_SPMI_MSM_PMIC_ARB=y CONFIG_STAGING_MEDIA=y ## +## file: drivers/staging/media/hantro/Kconfig +## +CONFIG_VIDEO_HANTRO=m + +## +## file: drivers/staging/media/rkvdec/Kconfig +## +CONFIG_VIDEO_ROCKCHIP_VDEC=m + +## ## file: drivers/staging/media/sunxi/Kconfig ## CONFIG_VIDEO_SUNXI=y @@ -1439,6 +1519,7 @@ CONFIG_TYPEC=m ## file: drivers/usb/typec/tcpm/Kconfig ## CONFIG_TYPEC_TCPM=m +CONFIG_TYPEC_TCPCI=m CONFIG_TYPEC_FUSB302=m ## @@ -1485,10 +1566,12 @@ CONFIG_ARM_SBSA_WATCHDOG=m CONFIG_ARMADA_37XX_WATCHDOG=m CONFIG_DW_WATCHDOG=m CONFIG_SUNXI_WATCHDOG=m +CONFIG_IMX2_WDT=m CONFIG_TEGRA_WATCHDOG=m CONFIG_QCOM_WDT=m CONFIG_MESON_GXBB_WATCHDOG=m CONFIG_MESON_WATCHDOG=m +CONFIG_I6300ESB_WDT=m CONFIG_BCM2835_WDT=m ## @@ -1547,14 +1630,18 @@ CONFIG_SND_BCM2835_SOC_I2S=m ## file: sound/soc/codecs/Kconfig ## CONFIG_SND_SOC_ES8316=m +CONFIG_SND_SOC_GTM601=m CONFIG_SND_SOC_RK3328=m +CONFIG_SND_SOC_SGTL5000=m CONFIG_SND_SOC_SIMPLE_AMPLIFIER=m +CONFIG_SND_SOC_SPDIF=m CONFIG_SND_SOC_WM8904=m ## ## file: sound/soc/fsl/Kconfig ## CONFIG_SND_SOC_FSL_SAI=m +CONFIG_SND_SOC_FSL_SPDIF=m ## ## file: sound/soc/generic/Kconfig @@ -1568,6 +1655,12 @@ CONFIG_SND_AUDIO_GRAPH_CARD=m CONFIG_SND_I2S_HI6210_I2S=m ## +## file: sound/soc/meson/Kconfig +## +CONFIG_SND_MESON_AXG_SOUND_CARD=m +CONFIG_SND_MESON_GX_SOUND_CARD=m + +## ## file: sound/soc/qcom/Kconfig ## CONFIG_SND_SOC_QCOM=m @@ -1578,6 +1671,7 @@ CONFIG_SND_SOC_APQ8016_SBC=m ## CONFIG_SND_SOC_ROCKCHIP=m CONFIG_SND_SOC_ROCKCHIP_I2S=m +CONFIG_SND_SOC_ROCKCHIP_PDM=m CONFIG_SND_SOC_ROCKCHIP_SPDIF=m CONFIG_SND_SOC_ROCKCHIP_RT5645=m 
CONFIG_SND_SOC_RK3399_GRU_SOUND=m @@ -1600,4 +1694,3 @@ CONFIG_SND_SOC_TEGRA_TRIMSLICE=m CONFIG_SND_SOC_TEGRA_ALC5632=m CONFIG_SND_SOC_TEGRA_MAX98090=m CONFIG_SND_SOC_TEGRA_RT5677=m - diff --git a/debian/config/arm64/config.cloud-arm64 b/debian/config/arm64/config.cloud-arm64 index 2c7a59e27..9091c3b39 100644 --- a/debian/config/arm64/config.cloud-arm64 +++ b/debian/config/arm64/config.cloud-arm64 @@ -110,4 +110,3 @@ CONFIG_POWER_SUPPLY_HWMON=y ## file: kernel/power/Kconfig ## CONFIG_PM=y - diff --git a/debian/config/arm64/defines b/debian/config/arm64/defines index a61d02a4b..dec69491a 100644 --- a/debian/config/arm64/defines +++ b/debian/config/arm64/defines @@ -24,5 +24,5 @@ hardware-long: cloud platforms supporting arm64 virtual machines [arm64_image] [relations] -gcc-10: gcc-10 <!stage1 !cross !pkg.linux.nokernel>, gcc-10-aarch64-linux-gnu <!stage1 cross !pkg.linux.nokernel>, gcc-arm-linux-gnueabihf <!stage1 !pkg.linux.nokernel> -headers%gcc-10: gcc-10 +gcc-11: gcc-11 <!stage1 !cross !pkg.linux.nokernel>, gcc-11-aarch64-linux-gnu <!stage1 cross !pkg.linux.nokernel>, gcc-arm-linux-gnueabihf <!stage1 !pkg.linux.nokernel> +headers%gcc-11: gcc-11 diff --git a/debian/config/armel/config b/debian/config/armel/config index 2cb39c7e3..b3859d5d3 100644 --- a/debian/config/armel/config +++ b/debian/config/armel/config @@ -12,4 +12,3 @@ ## file: security/tomoyo/Kconfig ## # CONFIG_SECURITY_TOMOYO is not set - diff --git a/debian/config/armel/config.marvell b/debian/config/armel/config.marvell index af1423f3e..2d502cc42 100644 --- a/debian/config/armel/config.marvell +++ b/debian/config/armel/config.marvell @@ -249,6 +249,7 @@ CONFIG_HWMON=m CONFIG_SENSORS_G760A=m CONFIG_SENSORS_G762=m CONFIG_SENSORS_GPIO_FAN=m +CONFIG_SENSORS_LM63=m CONFIG_SENSORS_LM75=m ## @@ -856,4 +857,3 @@ CONFIG_SND_KIRKWOOD_SOC=m # CONFIG_RD_LZMA is not set # CONFIG_RD_LZO is not set # CONFIG_RD_LZ4 is not set - diff --git a/debian/config/armel/config.rpi b/debian/config/armel/config.rpi index 
7fc1158b0..416bb1ba6 100644 --- a/debian/config/armel/config.rpi +++ b/debian/config/armel/config.rpi @@ -17,6 +17,26 @@ CONFIG_ARCH_BCM2835=y CONFIG_BT_HCIUART=m ## +## file: drivers/clk/bcm/Kconfig +## +CONFIG_CLK_RASPBERRYPI=y + +## +## file: drivers/cpufreq/Kconfig +## +## choice: Default CPUFreq governor +# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y +## end choice +CONFIG_CPU_FREQ_GOV_ONDEMAND=y +CONFIG_CPUFREQ_DT=m + +## +## file: drivers/cpufreq/Kconfig.arm +## +CONFIG_ARM_RASPBERRYPI_CPUFREQ=m + +## ## file: drivers/dma/Kconfig ## CONFIG_DMADEVICES=y @@ -192,4 +212,3 @@ CONFIG_SND_SOC=y ## file: sound/soc/bcm/Kconfig ## CONFIG_SND_BCM2835_SOC_I2S=y - diff --git a/debian/config/armel/defines b/debian/config/armel/defines index 8194b3af9..031067afd 100644 --- a/debian/config/armel/defines +++ b/debian/config/armel/defines @@ -13,15 +13,15 @@ uncompressed-image-file: arch/arm/boot/Image install-stem: vmlinuz [relations] -headers%gcc-10: linux-compiler-gcc-10-arm +headers%gcc-11: linux-compiler-gcc-11-arm [marvell_description] hardware: Marvell Kirkwood/Orion hardware-long: Marvell Kirkwood and Orion based systems (https://wiki.debian.org/ArmEabiPort#Supported_hardware) [rpi_description] -hardware: Raspberry Pi and Pi Zero -hardware-long: Raspberry Pi, Raspberry Pi Zero based systems +hardware: Raspberry Pi Zero, Zero W and 1 +hardware-long: Raspberry Pi Zero, Zero W and 1 based systems [marvell_image] recommends: u-boot-tools diff --git a/debian/config/armhf/config b/debian/config/armhf/config index aa78d6e0a..64122e618 100644 --- a/debian/config/armhf/config +++ b/debian/config/armhf/config @@ -37,6 +37,20 @@ CONFIG_KERNEL_MODE_NEON=y # CONFIG_DEBUG_LL is not set ## +## file: arch/arm/crypto/Kconfig +## +CONFIG_CRYPTO_SHA1_ARM_NEON=m +CONFIG_CRYPTO_SHA1_ARM_CE=m +CONFIG_CRYPTO_SHA2_ARM_CE=m +CONFIG_CRYPTO_BLAKE2B_NEON=m +CONFIG_CRYPTO_AES_ARM_BS=m +CONFIG_CRYPTO_AES_ARM_CE=m +CONFIG_CRYPTO_GHASH_ARM_CE=m 
+CONFIG_CRYPTO_CRCT10DIF_ARM_CE=m +CONFIG_CRYPTO_CRC32_ARM_CE=m +CONFIG_CRYPTO_NHPOLY1305_NEON=m + +## ## file: arch/arm/mach-aspeed/Kconfig ## CONFIG_ARCH_ASPEED=y @@ -83,6 +97,7 @@ CONFIG_ARCH_MESON=y ## CONFIG_ARCH_MMP=y CONFIG_MACH_MMP2_DT=y +CONFIG_MACH_MMP3_DT=y ## ## file: arch/arm/mach-mvebu/Kconfig @@ -224,6 +239,7 @@ CONFIG_COMMON_CLK_SI5351=m CONFIG_COMMON_CLK_ASPEED=y CONFIG_COMMON_CLK_S2MPS11=m CONFIG_CLK_TWL6040=m +CONFIG_COMMON_CLK_MMP2_AUDIO=m ## ## file: drivers/clk/bcm/Kconfig @@ -376,6 +392,7 @@ CONFIG_DRM_TI_TPD12S015=m ## file: drivers/gpu/drm/bridge/synopsys/Kconfig ## CONFIG_DRM_DW_HDMI_AHB_AUDIO=m +CONFIG_DRM_DW_HDMI_I2S_AUDIO=m CONFIG_DRM_DW_HDMI_CEC=m ## @@ -630,6 +647,7 @@ CONFIG_TEGRA_IOMMU_SMMU=y ## file: drivers/leds/Kconfig ## CONFIG_LEDS_CLASS=y +CONFIG_LEDS_ARIEL=m CONFIG_LEDS_LP5523=m CONFIG_LEDS_PCA963X=m CONFIG_LEDS_DA9052=m @@ -723,6 +741,7 @@ CONFIG_MFD_AXP20X_RSB=y # CONFIG_MFD_CROS_EC_DEV is not set CONFIG_MFD_DA9052_SPI=y CONFIG_MFD_DA9052_I2C=y +CONFIG_MFD_ENE_KB3930=m CONFIG_MFD_MC13XXX_SPI=m CONFIG_MFD_MC13XXX_I2C=m CONFIG_MFD_MAX77686=y @@ -1044,6 +1063,7 @@ CONFIG_PHY_SUN9I_USB=m ## CONFIG_PHY_MVEBU_A38X_COMPHY=m CONFIG_PHY_PXA_USB=m +CONFIG_PHY_MMP3_USB=m ## ## file: drivers/phy/rockchip/Kconfig @@ -1599,6 +1619,16 @@ CONFIG_SND_AUDIO_GRAPH_CARD=m CONFIG_SND_KIRKWOOD_SOC=m ## +## file: sound/soc/meson/Kconfig +## +CONFIG_SND_MESON_GX_SOUND_CARD=m + +## +## file: sound/soc/pxa/Kconfig +## +CONFIG_SND_MMP_SOC_SSPA=m + +## ## file: sound/soc/rockchip/Kconfig ## CONFIG_SND_SOC_ROCKCHIP=m @@ -1621,6 +1651,7 @@ CONFIG_SND_SOC_STM32_DFSDM=m CONFIG_SND_SUN4I_CODEC=m CONFIG_SND_SUN8I_CODEC=m CONFIG_SND_SUN8I_CODEC_ANALOG=m +CONFIG_SND_SUN4I_I2S=m CONFIG_SND_SUN4I_SPDIF=m ## @@ -1643,4 +1674,3 @@ CONFIG_SND_SOC_NOKIA_RX51=m CONFIG_SND_SOC_OMAP3_PANDORA=m CONFIG_SND_SOC_OMAP3_TWL4030=m CONFIG_SND_SOC_OMAP_ABE_TWL6040=m - diff --git a/debian/config/armhf/config.armmp-lpae b/debian/config/armhf/config.armmp-lpae index 
6eaad12bd..a23edc64b 100644 --- a/debian/config/armhf/config.armmp-lpae +++ b/debian/config/armhf/config.armmp-lpae @@ -12,4 +12,3 @@ CONFIG_ARM_LPAE=y ## file: drivers/iommu/Kconfig ## CONFIG_ARM_SMMU=y - diff --git a/debian/config/armhf/defines b/debian/config/armhf/defines index 1050d8212..7607ea2d9 100644 --- a/debian/config/armhf/defines +++ b/debian/config/armhf/defines @@ -12,7 +12,7 @@ vdso: true install-stem: vmlinuz [relations] -headers%gcc-10: linux-compiler-gcc-10-arm +headers%gcc-11: linux-compiler-gcc-11-arm [armmp_description] hardware: ARMv7 multiplatform compatible SoCs diff --git a/debian/config/config b/debian/config/config index da0eddf63..cf8f8fd99 100644 --- a/debian/config/config +++ b/debian/config/config @@ -14,12 +14,10 @@ CONFIG_STRICT_MODULE_RWX=y ## file: block/Kconfig ## CONFIG_BLOCK=y -CONFIG_BLK_DEV_BSG=y CONFIG_BLK_DEV_INTEGRITY=y CONFIG_BLK_DEV_ZONED=y CONFIG_BLK_DEV_THROTTLING=y # CONFIG_BLK_DEV_THROTTLING_LOW is not set -# CONFIG_BLK_CMDLINE_PARSER is not set CONFIG_BLK_WBT=y CONFIG_BLK_WBT_MQ=y # CONFIG_BLK_CGROUP_IOLATENCY is not set @@ -421,6 +419,7 @@ CONFIG_BT_HCIUART_LL=y CONFIG_BT_HCIUART_3WIRE=y CONFIG_BT_HCIUART_INTEL=y CONFIG_BT_HCIUART_BCM=y +CONFIG_BT_HCIBTUSB_MTK=y CONFIG_BT_HCIUART_RTL=y CONFIG_BT_HCIUART_QCA=y CONFIG_BT_HCIUART_AG6XX=y @@ -436,6 +435,12 @@ CONFIG_BT_MTKUART=m # CONFIG_OMAP_OCP2SCP is not set ## +## file: drivers/bus/mhi/Kconfig +## +CONFIG_MHI_BUS=m +CONFIG_MHI_BUS_PCI_GENERIC=m + +## ## file: drivers/char/Kconfig ## CONFIG_TTY_PRINTK=m @@ -734,11 +739,6 @@ CONFIG_DRM_AMD_DC_SI=y # CONFIG_DRM_AST is not set ## -## file: drivers/gpu/drm/bochs/Kconfig -## -CONFIG_DRM_BOCHS=m - -## ## file: drivers/gpu/drm/bridge/Kconfig ## # CONFIG_DRM_NXP_PTN3460 is not set @@ -823,6 +823,7 @@ CONFIG_DRM_QXL=m ## ## file: drivers/gpu/drm/tiny/Kconfig ## +CONFIG_DRM_BOCHS=m CONFIG_DRM_CIRRUS_QEMU=m ## @@ -1983,11 +1984,6 @@ CONFIG_LEDS_TRIGGER_NETDEV=m CONFIG_LEDS_TRIGGER_PATTERN=m ## -## file: 
drivers/lightnvm/Kconfig -## -# CONFIG_NVM is not set - -## ## file: drivers/mailbox/Kconfig ## # CONFIG_MAILBOX is not set @@ -2034,7 +2030,7 @@ CONFIG_DM_UEVENT=y CONFIG_DM_FLAKEY=m CONFIG_DM_VERITY=m CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y -# CONFIG_DM_VERITY_FEC is not set +CONFIG_DM_VERITY_FEC=y CONFIG_DM_SWITCH=m CONFIG_DM_LOG_WRITES=m CONFIG_DM_INTEGRITY=m @@ -3107,6 +3103,7 @@ CONFIG_VIRTIO_NET=m CONFIG_NLMON=m CONFIG_NET_VRF=m CONFIG_VSOCKMON=m +CONFIG_MHI_NET=m CONFIG_XEN_NETDEV_FRONTEND=m CONFIG_XEN_NETDEV_BACKEND=m # CONFIG_VMXNET3 is not set @@ -3714,7 +3711,7 @@ CONFIG_NET_VENDOR_WIZNET=y ## ## file: drivers/net/ethernet/xscale/Kconfig ## -CONFIG_PTP_1588_CLOCK_IXP46X=m +CONFIG_PTP_1588_CLOCK_IXP46X=y ## ## file: drivers/net/fddi/Kconfig @@ -4098,7 +4095,6 @@ CONFIG_IWLMVM=m ## file: drivers/net/wireless/intersil/Kconfig ## CONFIG_WLAN_VENDOR_INTERSIL=y -# CONFIG_PRISM54 is not set ## ## file: drivers/net/wireless/intersil/hostap/Kconfig @@ -4301,6 +4297,12 @@ CONFIG_ZD1211RW=m # CONFIG_ZD1211RW_DEBUG is not set ## +## file: drivers/net/wwan/Kconfig +## +CONFIG_WWAN=m +CONFIG_MHI_WWAN_CTRL=m + +## ## file: drivers/nfc/Kconfig ## # CONFIG_NFC_TRF7970A is not set @@ -4567,7 +4569,13 @@ CONFIG_PPS_CLIENT_PARPORT=m ## file: drivers/ptp/Kconfig ## CONFIG_PTP_1588_CLOCK=m +CONFIG_PTP_1588_CLOCK_DTE=m CONFIG_PTP_1588_CLOCK_QORIQ=m +CONFIG_DP83640_PHY=m +CONFIG_PTP_1588_CLOCK_INES=m +CONFIG_PTP_1588_CLOCK_IDT82P33=m +CONFIG_PTP_1588_CLOCK_IDTCM=m +CONFIG_PTP_1588_CLOCK_OCP=m ## ## file: drivers/pwm/Kconfig @@ -4712,6 +4720,7 @@ CONFIG_BLK_DEV_SD=m CONFIG_CHR_DEV_ST=m CONFIG_BLK_DEV_SR=m CONFIG_CHR_DEV_SG=m +CONFIG_BLK_DEV_BSG=y CONFIG_CHR_DEV_SCH=m CONFIG_SCSI_ENCLOSURE=m CONFIG_SCSI_CONSTANTS=y @@ -5047,7 +5056,7 @@ CONFIG_DVB_SP8870=m CONFIG_QLGE=m ## -## file: drivers/staging/rtl8188eu/Kconfig +## file: drivers/staging/r8188eu/Kconfig ## CONFIG_R8188EU=m CONFIG_88EU_AP_MODE=y @@ -5497,6 +5506,7 @@ CONFIG_USB_SERIAL_WISHBONE=m CONFIG_USB_SERIAL_SSU100=m 
CONFIG_USB_SERIAL_QT2=m CONFIG_USB_SERIAL_UPD78F0730=m +CONFIG_USB_SERIAL_XR=m CONFIG_USB_SERIAL_DEBUG=m ## @@ -5750,7 +5760,6 @@ CONFIG_XEN_MCE_LOG=y ## CONFIG_FS_DAX=y CONFIG_FILE_LOCKING=y -CONFIG_MANDATORY_FILE_LOCKING=y CONFIG_TMPFS=y CONFIG_TMPFS_POSIX_ACL=y CONFIG_TMPFS_INODE64=y @@ -5825,7 +5834,6 @@ CONFIG_BTRFS_FS_POSIX_ACL=y ## CONFIG_CACHEFILES=m # CONFIG_CACHEFILES_DEBUG is not set -# CONFIG_CACHEFILES_HISTOGRAM is not set ## ## file: fs/ceph/Kconfig @@ -5838,7 +5846,6 @@ CONFIG_CEPH_FS_POSIX_ACL=y ## file: fs/cifs/Kconfig ## CONFIG_CIFS=m -CONFIG_CIFS_WEAK_PW_HASH=y CONFIG_CIFS_UPCALL=y CONFIG_CIFS_XATTR=y CONFIG_CIFS_POSIX=y @@ -5932,9 +5939,9 @@ CONFIG_F2FS_FS_SECURITY=y # CONFIG_F2FS_FAULT_INJECTION is not set CONFIG_F2FS_FS_COMPRESSION=y CONFIG_F2FS_FS_LZO=y +CONFIG_F2FS_FS_LZORLE=y CONFIG_F2FS_FS_LZ4=y CONFIG_F2FS_FS_ZSTD=y -CONFIG_F2FS_FS_LZORLE=y ## ## file: fs/fat/Kconfig @@ -5958,9 +5965,7 @@ CONFIG_VXFS_FS=m ## CONFIG_FSCACHE=m CONFIG_FSCACHE_STATS=y -# CONFIG_FSCACHE_HISTOGRAM is not set # CONFIG_FSCACHE_DEBUG is not set -# CONFIG_FSCACHE_OBJECT_LIST is not set ## ## file: fs/fuse/Kconfig @@ -6529,8 +6534,8 @@ CONFIG_RCU_CPU_STALL_TIMEOUT=21 ## ## choice: Timer tick handling # CONFIG_HZ_PERIODIC is not set -CONFIG_NO_HZ_IDLE=y -# CONFIG_NO_HZ_FULL is not set +# CONFIG_NO_HZ_IDLE is not set +CONFIG_NO_HZ_FULL=y ## end choice #. 
Backward compatibility symbol # CONFIG_NO_HZ is not set @@ -6644,7 +6649,6 @@ CONFIG_DEBUG_LIST=y CONFIG_BUG_ON_DATA_CORRUPTION=y # CONFIG_DEBUG_CREDENTIALS is not set # CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set -# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set # CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set # CONFIG_LATENCYTOP is not set CONFIG_STRICT_DEVMEM=y @@ -6779,7 +6783,8 @@ CONFIG_PAGE_POISONING=y CONFIG_NET=y CONFIG_INET=y CONFIG_NETWORK_SECMARK=y -# CONFIG_NETWORK_PHY_TIMESTAMPING is not set +CONFIG_NET_PTP_CLASSIFY=y +CONFIG_NETWORK_PHY_TIMESTAMPING=y CONFIG_NETFILTER=y CONFIG_NETFILTER_ADVANCED=y CONFIG_BRIDGE_NETFILTER=y @@ -7630,7 +7635,7 @@ CONFIG_TIPC_DIAG=m ## ## file: net/tls/Kconfig ## -# CONFIG_TLS is not set +CONFIG_TLS=m ## ## file: net/unix/Kconfig @@ -8157,4 +8162,3 @@ CONFIG_RD_LZMA=y CONFIG_RD_XZ=y CONFIG_RD_LZO=y CONFIG_RD_LZ4=y - diff --git a/debian/config/config.cloud b/debian/config/config.cloud index bdd1c4cd0..31fba49ea 100644 --- a/debian/config/config.cloud +++ b/debian/config/config.cloud @@ -467,11 +467,6 @@ CONFIG_HYPERV_KEYBOARD=m # CONFIG_NEW_LEDS is not set ## -## file: drivers/lightnvm/Kconfig -## -# CONFIG_NVM is not set - -## ## file: drivers/macintosh/Kconfig ## # CONFIG_MACINTOSH_DRIVERS is not set @@ -1702,4 +1697,3 @@ CONFIG_SECURITY_INFINIBAND=y ## file: sound/Kconfig ## # CONFIG_SOUND is not set - diff --git a/debian/config/defines b/debian/config/defines index 41b1e59f3..11db9415d 100644 --- a/debian/config/defines +++ b/debian/config/defines @@ -142,7 +142,7 @@ arches: sparc sparc64 x32 -compiler: gcc-10 +compiler: gcc-11 featuresets: none rt @@ -154,7 +154,7 @@ signed-code: false trusted-certs: debian/certs/debian-uefi-certs.pem [featureset-rt_base] -enabled: false +enabled: true [description] part-long-up: This kernel is not suitable for SMP (multi-processor, @@ -167,7 +167,7 @@ recommends: apparmor [relations] # compilers -gcc-10: gcc-10 <!stage1 !cross !pkg.linux.nokernel>, gcc-10-@gnu-type-package@ <!stage1 cross 
!pkg.linux.nokernel> +gcc-11: gcc-11 <!stage1 !cross !pkg.linux.nokernel>, gcc-11-@gnu-type-package@ <!stage1 cross !pkg.linux.nokernel> # initramfs-generators initramfs-fallback: linux-initramfs-tool diff --git a/debian/config/hppa/config b/debian/config/hppa/config index 90e0e6645..383fcfbe6 100644 --- a/debian/config/hppa/config +++ b/debian/config/hppa/config @@ -209,7 +209,7 @@ CONFIG_B44=m ## ## file: drivers/net/ethernet/cirrus/Kconfig ## -CONFIG_CS89x0=m +CONFIG_CS89x0_ISA=m ## ## file: drivers/net/ethernet/dec/tulip/Kconfig @@ -662,4 +662,3 @@ CONFIG_SND_HARMONY=m ## file: sound/pci/hda/Kconfig ## # CONFIG_SND_HDA_INTEL is not set - diff --git a/debian/config/hppa/config.parisc b/debian/config/hppa/config.parisc index ec658ba1a..5ca2deb2c 100644 --- a/debian/config/hppa/config.parisc +++ b/debian/config/hppa/config.parisc @@ -24,4 +24,3 @@ CONFIG_TLAN=m ## CONFIG_PCMCIA_AHA152X=m CONFIG_PCMCIA_NINJA_SCSI=m - diff --git a/debian/config/hppa/config.parisc64 b/debian/config/hppa/config.parisc64 index aae1ef937..5870d9696 100644 --- a/debian/config/hppa/config.parisc64 +++ b/debian/config/hppa/config.parisc64 @@ -54,4 +54,3 @@ CONFIG_I2C_ALGOBIT=y # CONFIG_FLATMEM_MANUAL is not set # CONFIG_SPARSEMEM_MANUAL is not set ## end choice - diff --git a/debian/config/hppa/defines b/debian/config/hppa/defines index 2dc7853a3..a2ee28276 100644 --- a/debian/config/hppa/defines +++ b/debian/config/hppa/defines @@ -25,5 +25,5 @@ hardware: 64-bit PA-RISC hardware-long: HP PA-RISC 64-bit systems with support for more than 4 GB RAM [relations] -gcc-10: gcc-10 <!stage1 !cross !pkg.linux.nokernel>, gcc-10-hppa-linux-gnu <!stage1 cross !pkg.linux.nokernel>, binutils-hppa64-linux-gnu <!stage1 !pkg.linux.nokernel>, gcc-10-hppa64-linux-gnu <!stage1 !pkg.linux.nokernel> +gcc-11: gcc-11 <!stage1 !cross !pkg.linux.nokernel>, gcc-11-hppa-linux-gnu <!stage1 cross !pkg.linux.nokernel>, binutils-hppa64-linux-gnu <!stage1 !pkg.linux.nokernel>, gcc-11-hppa64-linux-gnu <!stage1 
!pkg.linux.nokernel> diff --git a/debian/config/i386/config b/debian/config/i386/config index 1d63bdd73..85cfdfb53 100644 --- a/debian/config/i386/config +++ b/debian/config/i386/config @@ -308,7 +308,7 @@ CONFIG_NI65=m ## file: drivers/net/ethernet/cirrus/Kconfig ## CONFIG_NET_VENDOR_CIRRUS=y -CONFIG_CS89x0=m +CONFIG_CS89x0_ISA=m ## ## file: drivers/net/ethernet/dec/Kconfig @@ -504,4 +504,3 @@ CONFIG_SND_MSND_CLASSIC=m CONFIG_SND_CS5530=m CONFIG_SND_CS5535AUDIO=m CONFIG_SND_SIS7019=m - diff --git a/debian/config/i386/config.686 b/debian/config/i386/config.686 index 80b48b5fe..13b0e7b21 100644 --- a/debian/config/i386/config.686 +++ b/debian/config/i386/config.686 @@ -80,4 +80,3 @@ CONFIG_FB_GEODE_GX1=m ## file: lib/Kconfig.debug ## # CONFIG_DEBUG_HIGHMEM is not set - diff --git a/debian/config/i386/config.686-pae b/debian/config/i386/config.686-pae index 64db14c27..22cce6e92 100644 --- a/debian/config/i386/config.686-pae +++ b/debian/config/i386/config.686-pae @@ -50,4 +50,3 @@ CONFIG_I2C_STUB=m ## file: lib/Kconfig.debug ## # CONFIG_DEBUG_HIGHMEM is not set - diff --git a/debian/config/i386/defines b/debian/config/i386/defines index f86e91d61..d471d747d 100644 --- a/debian/config/i386/defines +++ b/debian/config/i386/defines @@ -21,7 +21,7 @@ install-stem: vmlinuz breaks: xserver-xorg-input-vmmouse (<< 1:13.0.99) [relations] -headers%gcc-10: linux-compiler-gcc-10-x86 +headers%gcc-11: linux-compiler-gcc-11-x86 [686_description] hardware: older PCs diff --git a/debian/config/ia64/config b/debian/config/ia64/config index 9235c6688..5934a7d28 100644 --- a/debian/config/ia64/config +++ b/debian/config/ia64/config @@ -733,4 +733,3 @@ CONFIG_SND_YMFPCI=m ## file: sound/pci/hda/Kconfig ## CONFIG_SND_HDA_INTEL=m - diff --git a/debian/config/ia64/config.itanium b/debian/config/ia64/config.itanium index e683b846b..1d1da671f 100644 --- a/debian/config/ia64/config.itanium +++ b/debian/config/ia64/config.itanium @@ -8,4 +8,3 @@ CONFIG_ITANIUM=y CONFIG_SMP=y CONFIG_NR_CPUS=64 # 
CONFIG_SCHED_SMT is not set - diff --git a/debian/config/ia64/config.mckinley b/debian/config/ia64/config.mckinley index 6da85354e..581fcde88 100644 --- a/debian/config/ia64/config.mckinley +++ b/debian/config/ia64/config.mckinley @@ -8,4 +8,3 @@ CONFIG_MCKINLEY=y CONFIG_SMP=y CONFIG_NR_CPUS=64 # CONFIG_SCHED_SMT is not set - diff --git a/debian/config/kernelarch-arm/config b/debian/config/kernelarch-arm/config index f176eb79c..e47a9edd2 100644 --- a/debian/config/kernelarch-arm/config +++ b/debian/config/kernelarch-arm/config @@ -24,8 +24,10 @@ CONFIG_EARLY_PRINTK=y ## file: arch/arm/crypto/Kconfig ## CONFIG_CRYPTO_SHA1_ARM=m +CONFIG_CRYPTO_SHA256_ARM=m +CONFIG_CRYPTO_SHA512_ARM=m +CONFIG_CRYPTO_BLAKE2S_ARM=m CONFIG_CRYPTO_AES_ARM=m -CONFIG_CRYPTO_NHPOLY1305_NEON=m ## ## file: arch/arm/mm/Kconfig @@ -63,6 +65,11 @@ CONFIG_MOUSE_APPLETOUCH=m # CONFIG_TOUCHSCREEN_EETI is not set ## +## file: drivers/leds/trigger/Kconfig +## +CONFIG_LEDS_TRIGGER_HEARTBEAT=y + +## ## file: drivers/mtd/maps/Kconfig ## CONFIG_MTD_PHYSMAP=y @@ -132,4 +139,3 @@ CONFIG_CPU_THERMAL=y ## CONFIG_XZ_DEC_ARM=y CONFIG_XZ_DEC_ARMTHUMB=y - diff --git a/debian/config/kernelarch-mips/config b/debian/config/kernelarch-mips/config index 730c5b274..8f02e6a85 100644 --- a/debian/config/kernelarch-mips/config +++ b/debian/config/kernelarch-mips/config @@ -80,4 +80,3 @@ CONFIG_USB_OHCI_HCD=m ## CONFIG_SGETMASK_SYSCALL=y CONFIG_SYSFS_SYSCALL=y - diff --git a/debian/config/kernelarch-mips/config.boston b/debian/config/kernelarch-mips/config.boston index 4e860569d..1cbeab204 100644 --- a/debian/config/kernelarch-mips/config.boston +++ b/debian/config/kernelarch-mips/config.boston @@ -66,4 +66,3 @@ CONFIG_SPI_TOPCLIFF_PCH=y ## file: drivers/tty/serial/8250/Kconfig ## CONFIG_SERIAL_OF_PLATFORM=y - diff --git a/debian/config/kernelarch-mips/config.loongson-3 b/debian/config/kernelarch-mips/config.loongson-3 index 8f311cbf7..7def9a546 100644 --- a/debian/config/kernelarch-mips/config.loongson-3 +++ 
b/debian/config/kernelarch-mips/config.loongson-3 @@ -139,4 +139,3 @@ CONFIG_PREEMPT=y ## file: sound/pci/hda/Kconfig ## CONFIG_SND_HDA_INTEL=m - diff --git a/debian/config/kernelarch-mips/config.malta b/debian/config/kernelarch-mips/config.malta index 9099f4290..23ababc46 100644 --- a/debian/config/kernelarch-mips/config.malta +++ b/debian/config/kernelarch-mips/config.malta @@ -487,4 +487,3 @@ CONFIG_SND_VIA82XX=m CONFIG_SND_VIA82XX_MODEM=m CONFIG_SND_VX222=m CONFIG_SND_YMFPCI=m - diff --git a/debian/config/kernelarch-mips/config.mips32r2 b/debian/config/kernelarch-mips/config.mips32r2 index 88f2c05a9..f8de842c7 100644 --- a/debian/config/kernelarch-mips/config.mips32r2 +++ b/debian/config/kernelarch-mips/config.mips32r2 @@ -7,4 +7,3 @@ CONFIG_CPU_MIPS32_R2=y ## choice: Kernel code model CONFIG_32BIT=y ## end choice - diff --git a/debian/config/kernelarch-mips/config.mips32r6 b/debian/config/kernelarch-mips/config.mips32r6 index c95ffabb9..1771b56be 100644 --- a/debian/config/kernelarch-mips/config.mips32r6 +++ b/debian/config/kernelarch-mips/config.mips32r6 @@ -7,4 +7,3 @@ CONFIG_CPU_MIPS32_R6=y ## choice: Kernel code model CONFIG_32BIT=y ## end choice - diff --git a/debian/config/kernelarch-mips/config.mips64r2 b/debian/config/kernelarch-mips/config.mips64r2 index 1c1bed181..99507b262 100644 --- a/debian/config/kernelarch-mips/config.mips64r2 +++ b/debian/config/kernelarch-mips/config.mips64r2 @@ -7,4 +7,3 @@ CONFIG_CPU_MIPS64_R2=y ## choice: Kernel code model CONFIG_64BIT=y ## end choice - diff --git a/debian/config/kernelarch-mips/config.mips64r6 b/debian/config/kernelarch-mips/config.mips64r6 index 2cd32b0b5..1f72d372f 100644 --- a/debian/config/kernelarch-mips/config.mips64r6 +++ b/debian/config/kernelarch-mips/config.mips64r6 @@ -7,4 +7,3 @@ CONFIG_CPU_MIPS64_R6=y ## choice: Kernel code model CONFIG_64BIT=y ## end choice - diff --git a/debian/config/kernelarch-mips/config.octeon b/debian/config/kernelarch-mips/config.octeon index d4d9d543b..ec2327ed2 
100644 --- a/debian/config/kernelarch-mips/config.octeon +++ b/debian/config/kernelarch-mips/config.octeon @@ -186,4 +186,3 @@ CONFIG_USB_OCTEON_OHCI=y ## choice: Memory model CONFIG_SPARSEMEM_MANUAL=y ## end choice - diff --git a/debian/config/kernelarch-powerpc/config b/debian/config/kernelarch-powerpc/config index 1abc7ffe3..ede6f2c68 100644 --- a/debian/config/kernelarch-powerpc/config +++ b/debian/config/kernelarch-powerpc/config @@ -907,4 +907,3 @@ CONFIG_SND_HDA_INTEL=m ## CONFIG_SND_POWERMAC=m CONFIG_SND_POWERMAC_AUTO_DRC=y - diff --git a/debian/config/kernelarch-powerpc/config-arch-64 b/debian/config/kernelarch-powerpc/config-arch-64 index a247b9dfc..a24cd6067 100644 --- a/debian/config/kernelarch-powerpc/config-arch-64 +++ b/debian/config/kernelarch-powerpc/config-arch-64 @@ -205,4 +205,3 @@ CONFIG_CPUMASK_OFFSTACK=y CONFIG_SPARSEMEM_MANUAL=y ## end choice CONFIG_SPARSEMEM_VMEMMAP=y - diff --git a/debian/config/kernelarch-powerpc/config-arch-64-be b/debian/config/kernelarch-powerpc/config-arch-64-be index 20099b2fc..363d14c5b 100644 --- a/debian/config/kernelarch-powerpc/config-arch-64-be +++ b/debian/config/kernelarch-powerpc/config-arch-64-be @@ -108,4 +108,3 @@ CONFIG_SYSFS_SYSCALL=y ## CONFIG_SND_PS3=m CONFIG_SND_PS3_DEFAULT_START_DELAY=2000 - diff --git a/debian/config/kernelarch-powerpc/config-arch-64-le b/debian/config/kernelarch-powerpc/config-arch-64-le index 7cb371a69..6abccc58e 100644 --- a/debian/config/kernelarch-powerpc/config-arch-64-le +++ b/debian/config/kernelarch-powerpc/config-arch-64-le @@ -33,4 +33,3 @@ CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y ## #. 
See #789070 # CONFIG_HIBERNATION is not set - diff --git a/debian/config/kernelarch-sparc/config b/debian/config/kernelarch-sparc/config index cebab40be..022d612e8 100644 --- a/debian/config/kernelarch-sparc/config +++ b/debian/config/kernelarch-sparc/config @@ -602,4 +602,3 @@ CONFIG_SND_MAESTRO3=m CONFIG_SND_SUN_AMD7930=m CONFIG_SND_SUN_CS4231=m CONFIG_SND_SUN_DBRI=m - diff --git a/debian/config/kernelarch-sparc/config-smp b/debian/config/kernelarch-sparc/config-smp index f6412c26e..323cbd074 100644 --- a/debian/config/kernelarch-sparc/config-smp +++ b/debian/config/kernelarch-sparc/config-smp @@ -4,4 +4,3 @@ CONFIG_SMP=y CONFIG_NR_CPUS=256 CONFIG_SCHED_SMT=y - diff --git a/debian/config/kernelarch-sparc/config-up b/debian/config/kernelarch-sparc/config-up index 758621713..4ddea7054 100644 --- a/debian/config/kernelarch-sparc/config-up +++ b/debian/config/kernelarch-sparc/config-up @@ -2,4 +2,3 @@ ## file: arch/sparc/Kconfig ## # CONFIG_SMP is not set - diff --git a/debian/config/kernelarch-x86/config b/debian/config/kernelarch-x86/config index 1bf9c5593..8acb9af76 100644 --- a/debian/config/kernelarch-x86/config +++ b/debian/config/kernelarch-x86/config @@ -43,10 +43,6 @@ CONFIG_MICROCODE_AMD=y CONFIG_X86_MSR=m CONFIG_X86_CPUID=m CONFIG_AMD_MEM_ENCRYPT=y -#. Do not activate AMD Secure Memory Encryption (SME) by default -#. Until AMDGPU related incompatibilities are fixed, cf. #994453 -#. Can be activated by with the 'mem_encrypt=on' on the kernel command line. -# CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is not set CONFIG_NODES_SHIFT=6 # CONFIG_ARCH_MEMORY_PROBE is not set CONFIG_X86_PMEM_LEGACY=m @@ -73,8 +69,6 @@ CONFIG_RANDOMIZE_BASE=y CONFIG_MODIFY_LDT_SYSCALL=y # CONFIG_PCI_CNB20LE_QUIRK is not set # CONFIG_ISA_BUS is not set -#. 
Doesn't support handover; see #822575 -# CONFIG_X86_SYSFB is not set CONFIG_IA32_EMULATION=y ## @@ -127,10 +121,10 @@ CONFIG_KVM_AMD_SEV=y ## CONFIG_XEN=y CONFIG_XEN_PV=y -CONFIG_XEN_DOM0=y CONFIG_XEN_PVHVM_GUEST=y # CONFIG_XEN_DEBUG_FS is not set CONFIG_XEN_PVH=y +CONFIG_XEN_DOM0=y ## ## file: block/partitions/Kconfig @@ -554,6 +548,8 @@ CONFIG_FIRMWARE_MEMMAP=y CONFIG_DMIID=y CONFIG_ISCSI_IBFT_FIND=y CONFIG_ISCSI_IBFT=y +#. Doesn't support handover; see #822575 +# CONFIG_SYSFB_SIMPLEFB is not set ## ## file: drivers/firmware/efi/Kconfig @@ -690,6 +686,7 @@ CONFIG_SENSORS_K8TEMP=m CONFIG_SENSORS_K10TEMP=m CONFIG_SENSORS_FAM15H_POWER=m CONFIG_SENSORS_ASB100=m +CONFIG_SENSORS_CORSAIR_PSU=m CONFIG_SENSORS_DS1621=m CONFIG_SENSORS_DELL_SMM=m CONFIG_SENSORS_F71805F=m @@ -1285,7 +1282,6 @@ CONFIG_LANMEDIA=m CONFIG_PCI200SYN=m CONFIG_WANXL=m CONFIG_FARSYNC=m -# CONFIG_SBNI is not set ## ## file: drivers/net/wireless/Kconfig @@ -1418,7 +1414,6 @@ CONFIG_CHROME_PLATFORMS=y CONFIG_CHROMEOS_LAPTOP=m CONFIG_CHROMEOS_PSTORE=m CONFIG_CROS_EC=m -CONFIG_CROS_EC_PROTO=y CONFIG_CROS_KBD_LED_BACKLIGHT=m ## @@ -1434,7 +1429,6 @@ CONFIG_SURFACE_PRO3_BUTTON=m CONFIG_X86_PLATFORM_DEVICES=y CONFIG_ACPI_WMI=m CONFIG_HUAWEI_WMI=m -CONFIG_INTEL_WMI_THUNDERBOLT=m CONFIG_PEAQ_WMI=m CONFIG_XIAOMI_WMI=m CONFIG_ACERHDF=m @@ -1463,12 +1457,6 @@ CONFIG_THINKPAD_ACPI_ALSA_SUPPORT=y # CONFIG_THINKPAD_ACPI_UNSAFE_LEDS is not set CONFIG_THINKPAD_ACPI_VIDEO=y CONFIG_THINKPAD_ACPI_HOTKEY_POLL=y -CONFIG_INTEL_ATOMISP2_PM=m -CONFIG_INTEL_HID_EVENT=m -CONFIG_INTEL_INT0002_VGPIO=m -# CONFIG_INTEL_MENLOW is not set -CONFIG_INTEL_OAKTRAIL=m -CONFIG_INTEL_VBTN=m CONFIG_MSI_LAPTOP=m CONFIG_MSI_WMI=m CONFIG_PCENGINES_APU2=m @@ -1487,10 +1475,6 @@ CONFIG_SONYPI_COMPAT=y CONFIG_TOPSTAR_LAPTOP=m CONFIG_I2C_MULTI_INSTANTIATE=m CONFIG_INTEL_IPS=m -CONFIG_INTEL_RST=m -CONFIG_INTEL_SMARTCONNECT=m -CONFIG_INTEL_TURBO_MAX_3=y -CONFIG_INTEL_PMC_CORE=m ## ## file: drivers/platform/x86/dell/Kconfig @@ -1510,11 +1494,37 @@ 
CONFIG_DELL_WMI_AIO=m CONFIG_DELL_WMI_LED=m ## +## file: drivers/platform/x86/intel/Kconfig +## +CONFIG_INTEL_HID_EVENT=m +CONFIG_INTEL_VBTN=m +CONFIG_INTEL_INT0002_VGPIO=m +CONFIG_INTEL_OAKTRAIL=m +CONFIG_INTEL_RST=m +CONFIG_INTEL_SMARTCONNECT=m +CONFIG_INTEL_TURBO_MAX_3=y + +## +## file: drivers/platform/x86/intel/atomisp2/Kconfig +## +CONFIG_INTEL_ATOMISP2_PM=m + +## ## file: drivers/platform/x86/intel/int33fe/Kconfig ## CONFIG_INTEL_CHT_INT33FE=m ## +## file: drivers/platform/x86/intel/pmc/Kconfig +## +CONFIG_INTEL_PMC_CORE=m + +## +## file: drivers/platform/x86/intel/wmi/Kconfig +## +CONFIG_INTEL_WMI_THUNDERBOLT=m + +## ## file: drivers/pnp/Kconfig ## CONFIG_PNP=y @@ -1538,6 +1548,7 @@ CONFIG_INTEL_RAPL=m ## CONFIG_PTP_1588_CLOCK_PCH=m CONFIG_PTP_1588_CLOCK_KVM=m +CONFIG_PTP_1588_CLOCK_VMW=m ## ## file: drivers/pwm/Kconfig @@ -1664,6 +1675,7 @@ CONFIG_INTEL_POWERCLAMP=m CONFIG_X86_PKG_TEMP_THERMAL=m CONFIG_INTEL_SOC_DTS_THERMAL=m CONFIG_INTEL_PCH_THERMAL=m +# CONFIG_INTEL_MENLOW is not set ## ## file: drivers/thermal/intel/int340x_thermal/Kconfig @@ -2119,4 +2131,3 @@ CONFIG_SND_SOC_SOF_HDA_AUDIO_CODEC=y ## CONFIG_SND_X86=y CONFIG_HDMI_LPE_AUDIO=m - diff --git a/debian/config/m68k/config b/debian/config/m68k/config index d36be411a..319c6928f 100644 --- a/debian/config/m68k/config +++ b/debian/config/m68k/config @@ -857,4 +857,3 @@ CONFIG_IPV6=m CONFIG_DMASOUND_ATARI=m CONFIG_DMASOUND_PAULA=m CONFIG_DMASOUND_Q40=m - diff --git a/debian/config/mips/config b/debian/config/mips/config index 5942c9ab9..1d9268526 100644 --- a/debian/config/mips/config +++ b/debian/config/mips/config @@ -5,4 +5,3 @@ CONFIG_CPU_BIG_ENDIAN=y # CONFIG_CPU_LITTLE_ENDIAN is not set ## end choice - diff --git a/debian/config/mips64/config b/debian/config/mips64/config index 5942c9ab9..1d9268526 100644 --- a/debian/config/mips64/config +++ b/debian/config/mips64/config @@ -5,4 +5,3 @@ CONFIG_CPU_BIG_ENDIAN=y # CONFIG_CPU_LITTLE_ENDIAN is not set ## end choice - diff --git 
a/debian/config/mips64el/config b/debian/config/mips64el/config index 4807b611e..9f3326ab0 100644 --- a/debian/config/mips64el/config +++ b/debian/config/mips64el/config @@ -9,4 +9,3 @@ CONFIG_CPU_LITTLE_ENDIAN=y # CONFIG_CPU_MIPS64_R1 is not set CONFIG_CPU_MIPS64_R2=y ## end choice - diff --git a/debian/config/mips64el/defines b/debian/config/mips64el/defines index fb25ca9e9..7e126c187 100644 --- a/debian/config/mips64el/defines +++ b/debian/config/mips64el/defines @@ -25,6 +25,7 @@ hardware: Loongson 3A/3B hardware-long: Loongson 3A or 3B based systems (e.g. from Loongson or Lemote) [loongson-3_image] +recommends: pmon-update configs: kernelarch-mips/config.loongson-3 [octeon_description] diff --git a/debian/config/mips64r6/config b/debian/config/mips64r6/config index 5942c9ab9..1d9268526 100644 --- a/debian/config/mips64r6/config +++ b/debian/config/mips64r6/config @@ -5,4 +5,3 @@ CONFIG_CPU_BIG_ENDIAN=y # CONFIG_CPU_LITTLE_ENDIAN is not set ## end choice - diff --git a/debian/config/mips64r6el/config b/debian/config/mips64r6el/config index 7f124deb6..bf12a2398 100644 --- a/debian/config/mips64r6el/config +++ b/debian/config/mips64r6el/config @@ -5,4 +5,3 @@ # CONFIG_CPU_BIG_ENDIAN is not set CONFIG_CPU_LITTLE_ENDIAN=y ## end choice - diff --git a/debian/config/mipsel/config b/debian/config/mipsel/config index 7f124deb6..bf12a2398 100644 --- a/debian/config/mipsel/config +++ b/debian/config/mipsel/config @@ -5,4 +5,3 @@ # CONFIG_CPU_BIG_ENDIAN is not set CONFIG_CPU_LITTLE_ENDIAN=y ## end choice - diff --git a/debian/config/mipsel/defines b/debian/config/mipsel/defines index 9fcf2d43b..2e2a966a3 100644 --- a/debian/config/mipsel/defines +++ b/debian/config/mipsel/defines @@ -35,6 +35,7 @@ hardware: Loongson 3A/3B hardware-long: Loongson 3A or 3B based systems (e.g. 
from Loongson or Lemote) [loongson-3_image] +recommends: pmon-update configs: kernelarch-mips/config.loongson-3 [octeon_description] diff --git a/debian/config/mipsr6/config b/debian/config/mipsr6/config index 5942c9ab9..1d9268526 100644 --- a/debian/config/mipsr6/config +++ b/debian/config/mipsr6/config @@ -5,4 +5,3 @@ CONFIG_CPU_BIG_ENDIAN=y # CONFIG_CPU_LITTLE_ENDIAN is not set ## end choice - diff --git a/debian/config/mipsr6el/config b/debian/config/mipsr6el/config index 7f124deb6..bf12a2398 100644 --- a/debian/config/mipsr6el/config +++ b/debian/config/mipsr6el/config @@ -5,4 +5,3 @@ # CONFIG_CPU_BIG_ENDIAN is not set CONFIG_CPU_LITTLE_ENDIAN=y ## end choice - diff --git a/debian/config/powerpc/config.powerpc b/debian/config/powerpc/config.powerpc index c2ef65280..9dc3c24df 100644 --- a/debian/config/powerpc/config.powerpc +++ b/debian/config/powerpc/config.powerpc @@ -108,4 +108,3 @@ CONFIG_SYSFS_SYSCALL=y CONFIG_FLATMEM_MANUAL=y # CONFIG_SPARSEMEM_MANUAL is not set ## end choice - diff --git a/debian/config/powerpc/config.powerpc-smp b/debian/config/powerpc/config.powerpc-smp index dfbaaf6d2..c49683792 100644 --- a/debian/config/powerpc/config.powerpc-smp +++ b/debian/config/powerpc/config.powerpc-smp @@ -3,4 +3,3 @@ ## CONFIG_SMP=y CONFIG_NR_CPUS=4 - diff --git a/debian/config/riscv64/config b/debian/config/riscv64/config index 30f0567d1..0d5792355 100644 --- a/debian/config/riscv64/config +++ b/debian/config/riscv64/config @@ -7,6 +7,7 @@ CONFIG_SECCOMP=y ## file: arch/riscv/Kconfig ## CONFIG_SMP=y +CONFIG_NUMA=y CONFIG_KEXEC=y ## @@ -34,6 +35,11 @@ CONFIG_DRM=m CONFIG_DRM_RADEON=m ## +## file: drivers/hwmon/Kconfig +## +CONFIG_SENSORS_LM90=m + +## ## file: drivers/mmc/Kconfig ## CONFIG_MMC=m @@ -122,4 +128,3 @@ CONFIG_USB_EHCI_HCD=m CONFIG_USB_EHCI_HCD_PLATFORM=m CONFIG_USB_OHCI_HCD=m CONFIG_USB_OHCI_HCD_PLATFORM=m - diff --git a/debian/config/s390x/config b/debian/config/s390x/config index 61d2e5057..6a7dbfd98 100644 --- a/debian/config/s390x/config +++ 
b/debian/config/s390x/config @@ -113,7 +113,6 @@ CONFIG_HOTPLUG_PCI_S390=y ## ## file: drivers/s390/block/Kconfig ## -CONFIG_BLK_DEV_XPRAM=m CONFIG_DCSSBLK=m CONFIG_DASD=m # CONFIG_DASD_PROFILE is not set @@ -221,4 +220,3 @@ CONFIG_AFIUCV=m ## file: net/llc/Kconfig ## # CONFIG_LLC2 is not set - diff --git a/debian/config/s390x/defines b/debian/config/s390x/defines index ec8825526..922bd1aca 100644 --- a/debian/config/s390x/defines +++ b/debian/config/s390x/defines @@ -13,7 +13,7 @@ bootloaders: s390-tools install-stem: vmlinuz [relations] -headers%gcc-10: linux-compiler-gcc-10-s390 +headers%gcc-11: linux-compiler-gcc-11-s390 [s390x_description] hardware: IBM zSeries diff --git a/debian/config/sh4/config b/debian/config/sh4/config index a4d6d7a70..20b09e3fe 100644 --- a/debian/config/sh4/config +++ b/debian/config/sh4/config @@ -64,4 +64,3 @@ CONFIG_HZ_250=y ## file: kernel/irq/Kconfig ## CONFIG_SPARSE_IRQ=y - diff --git a/debian/config/sh4/config.sh7751r b/debian/config/sh4/config.sh7751r index 0e19bdcd8..6d8381081 100644 --- a/debian/config/sh4/config.sh7751r +++ b/debian/config/sh4/config.sh7751r @@ -169,4 +169,3 @@ CONFIG_FB_SM501=y ## choice: Memory model CONFIG_FLATMEM_MANUAL=y ## end choice - diff --git a/debian/config/sh4/config.sh7785lcr b/debian/config/sh4/config.sh7785lcr index 9b96eb998..9c1740f64 100644 --- a/debian/config/sh4/config.sh7785lcr +++ b/debian/config/sh4/config.sh7785lcr @@ -236,4 +236,3 @@ CONFIG_SPARSEMEM_MANUAL=y ## end choice CONFIG_MIGRATION=y CONFIG_DEFAULT_MMAP_MIN_ADDR=4096 - diff --git a/debian/installer/modules/scsi-core-modules b/debian/installer/modules/scsi-core-modules index e0d1d8fca..6b95c3a8f 100644 --- a/debian/installer/modules/scsi-core-modules +++ b/debian/installer/modules/scsi-core-modules @@ -1,3 +1,4 @@ +scsi_common scsi_mod sd_mod scsi_transport_sas ? 
diff --git a/debian/installer/modules/scsi-modules b/debian/installer/modules/scsi-modules index 2c5fba05d..dfc866448 100644 --- a/debian/installer/modules/scsi-modules +++ b/debian/installer/modules/scsi-modules @@ -27,6 +27,7 @@ ses - tcm_qla2xxx - # Exclude common code packaged in {cdrom,scsi}-core-modules +scsi_common - scsi_mod - sd_mod - sr_mod - diff --git a/debian/libcpupower1.symbols b/debian/libcpupower1.symbols index 5639cbe8a..5e98bb32d 100644 --- a/debian/libcpupower1.symbols +++ b/debian/libcpupower1.symbols @@ -37,4 +37,6 @@ libcpupower.so.1 libcpupower1 #MINVER# cpuidle_state_time@Base 4.7~rc2-1~exp1 cpuidle_state_usage@Base 4.7~rc2-1~exp1 cpupower_is_cpu_online@Base 4.7~rc2-1~exp1 + cpupower_read_sysfs@Base 5.13.9-1~exp1 + cpupower_write_sysfs@Base 5.13.9-1~exp1 get_cpu_topology@Base 4.7~rc2-1~exp1 diff --git a/debian/patches-rt/0001-gen_stats-Add-instead-Set-the-value-in-__gnet_stats_.patch b/debian/patches-rt/0001-gen_stats-Add-instead-Set-the-value-in-__gnet_stats_.patch new file mode 100644 index 000000000..513d4dc89 --- /dev/null +++ b/debian/patches-rt/0001-gen_stats-Add-instead-Set-the-value-in-__gnet_stats_.patch @@ -0,0 +1,164 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:02 +0200 +Subject: [PATCH 1/9] gen_stats: Add instead Set the value in + __gnet_stats_copy_basic(). +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +__gnet_stats_copy_basic() always assigns the value to the bstats +argument overwriting the previous value. The later added per-CPU version +always accumulated the values in the returning gnet_stats_basic_packed +argument. + +Based on review there are five users of that function as of today: +- est_fetch_counters(), ___gnet_stats_copy_basic() + memsets() bstats to zero, single invocation. 
+ +- mq_dump(), mqprio_dump(), mqprio_dump_class_stats() + memsets() bstats to zero, multiple invocation but does not use the + function due to !qdisc_is_percpu_stats(). + +Add the values in __gnet_stats_copy_basic() instead overwriting. Rename +the function to gnet_stats_add_basic() to make it more obvious. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. Miller <davem@davemloft.net> +--- + include/net/gen_stats.h | 8 ++++---- + net/core/gen_estimator.c | 2 +- + net/core/gen_stats.c | 29 ++++++++++++++++------------- + net/sched/sch_mq.c | 5 ++--- + net/sched/sch_mqprio.c | 11 +++++------ + 5 files changed, 28 insertions(+), 27 deletions(-) + +--- a/include/net/gen_stats.h ++++ b/include/net/gen_stats.h +@@ -46,10 +46,10 @@ int gnet_stats_copy_basic(const seqcount + struct gnet_dump *d, + struct gnet_stats_basic_cpu __percpu *cpu, + struct gnet_stats_basic_packed *b); +-void __gnet_stats_copy_basic(const seqcount_t *running, +- struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b); ++void gnet_stats_add_basic(const seqcount_t *running, ++ struct gnet_stats_basic_packed *bstats, ++ struct gnet_stats_basic_cpu __percpu *cpu, ++ struct gnet_stats_basic_packed *b); + int gnet_stats_copy_basic_hw(const seqcount_t *running, + struct gnet_dump *d, + struct gnet_stats_basic_cpu __percpu *cpu, +--- a/net/core/gen_estimator.c ++++ b/net/core/gen_estimator.c +@@ -66,7 +66,7 @@ static void est_fetch_counters(struct ne + if (e->stats_lock) + spin_lock(e->stats_lock); + +- __gnet_stats_copy_basic(e->running, b, e->cpu_bstats, e->bstats); ++ gnet_stats_add_basic(e->running, b, e->cpu_bstats, e->bstats); + + if (e->stats_lock) + spin_unlock(e->stats_lock); +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -114,9 +114,8 @@ gnet_stats_start_copy(struct sk_buff *sk + } + EXPORT_SYMBOL(gnet_stats_start_copy); + +-static void +-__gnet_stats_copy_basic_cpu(struct 
gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu) ++static void gnet_stats_add_basic_cpu(struct gnet_stats_basic_packed *bstats, ++ struct gnet_stats_basic_cpu __percpu *cpu) + { + int i; + +@@ -136,26 +135,30 @@ static void + } + } + +-void +-__gnet_stats_copy_basic(const seqcount_t *running, +- struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b) ++void gnet_stats_add_basic(const seqcount_t *running, ++ struct gnet_stats_basic_packed *bstats, ++ struct gnet_stats_basic_cpu __percpu *cpu, ++ struct gnet_stats_basic_packed *b) + { + unsigned int seq; ++ u64 bytes = 0; ++ u64 packets = 0; + + if (cpu) { +- __gnet_stats_copy_basic_cpu(bstats, cpu); ++ gnet_stats_add_basic_cpu(bstats, cpu); + return; + } + do { + if (running) + seq = read_seqcount_begin(running); +- bstats->bytes = b->bytes; +- bstats->packets = b->packets; ++ bytes = b->bytes; ++ packets = b->packets; + } while (running && read_seqcount_retry(running, seq)); ++ ++ bstats->bytes += bytes; ++ bstats->packets += packets; + } +-EXPORT_SYMBOL(__gnet_stats_copy_basic); ++EXPORT_SYMBOL(gnet_stats_add_basic); + + static int + ___gnet_stats_copy_basic(const seqcount_t *running, +@@ -166,7 +169,7 @@ static int + { + struct gnet_stats_basic_packed bstats = {0}; + +- __gnet_stats_copy_basic(running, &bstats, cpu, b); ++ gnet_stats_add_basic(running, &bstats, cpu, b); + + if (d->compat_tc_stats && type == TCA_STATS_BASIC) { + d->tc_stats.bytes = bstats.bytes; +--- a/net/sched/sch_mq.c ++++ b/net/sched/sch_mq.c +@@ -147,9 +147,8 @@ static int mq_dump(struct Qdisc *sch, st + + if (qdisc_is_percpu_stats(qdisc)) { + qlen = qdisc_qlen_sum(qdisc); +- __gnet_stats_copy_basic(NULL, &sch->bstats, +- qdisc->cpu_bstats, +- &qdisc->bstats); ++ gnet_stats_add_basic(NULL, &sch->bstats, ++ qdisc->cpu_bstats, &qdisc->bstats); + __gnet_stats_copy_queue(&sch->qstats, + qdisc->cpu_qstats, + &qdisc->qstats, qlen); +--- 
a/net/sched/sch_mqprio.c ++++ b/net/sched/sch_mqprio.c +@@ -405,9 +405,8 @@ static int mqprio_dump(struct Qdisc *sch + if (qdisc_is_percpu_stats(qdisc)) { + __u32 qlen = qdisc_qlen_sum(qdisc); + +- __gnet_stats_copy_basic(NULL, &sch->bstats, +- qdisc->cpu_bstats, +- &qdisc->bstats); ++ gnet_stats_add_basic(NULL, &sch->bstats, ++ qdisc->cpu_bstats, &qdisc->bstats); + __gnet_stats_copy_queue(&sch->qstats, + qdisc->cpu_qstats, + &qdisc->qstats, qlen); +@@ -535,9 +534,9 @@ static int mqprio_dump_class_stats(struc + if (qdisc_is_percpu_stats(qdisc)) { + qlen = qdisc_qlen_sum(qdisc); + +- __gnet_stats_copy_basic(NULL, &bstats, +- qdisc->cpu_bstats, +- &qdisc->bstats); ++ gnet_stats_add_basic(NULL, &bstats, ++ qdisc->cpu_bstats, ++ &qdisc->bstats); + __gnet_stats_copy_queue(&qstats, + qdisc->cpu_qstats, + &qdisc->qstats, diff --git a/debian/patches-rt/0001-sched-Trigger-warning-if-migration_disabled-counter-.patch b/debian/patches-rt/0001-sched-Trigger-warning-if-migration_disabled-counter-.patch new file mode 100644 index 000000000..158e7aa70 --- /dev/null +++ b/debian/patches-rt/0001-sched-Trigger-warning-if-migration_disabled-counter-.patch @@ -0,0 +1,28 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 12 Aug 2021 14:40:05 +0200 +Subject: [PATCH 01/10] sched: Trigger warning if ->migration_disabled counter + underflows. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +If migrate_enable() is used more often than its counterpart then it +remains undetected and rq::nr_pinned will underflow, too. + +Add a warning if migrate_enable() is attempted without a matching +migrate_disable().
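The counter discipline that this sched patch enforces can be sketched as a small user-space model. This is an editorial illustration, not part of the patch: all `model_*` names are hypothetical stand-ins for the kernel's `migrate_disable()`/`migrate_enable()` pair, and a boolean takes the place of `WARN_ON_ONCE()` firing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model: every migrate_enable() must pair with
 * an earlier migrate_disable(), otherwise the per-task counter (and,
 * in the kernel, rq::nr_pinned) would underflow. */
static int migration_disabled;
static bool underflow_warned;	/* stands in for WARN_ON_ONCE() firing */

static void model_migrate_disable(void)
{
	migration_disabled++;
}

/* Returns false when an unmatched enable is caught, mirroring the
 * early return the patch adds after the warning. */
static bool model_migrate_enable(void)
{
	if (migration_disabled == 0) {
		underflow_warned = true;
		return false;
	}
	migration_disabled--;
	return true;
}
```

The real patch additionally handles nested disables (`migration_disabled > 1`); the model only shows the underflow guard itself.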
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + kernel/sched/core.c | 2 ++ + 1 file changed, 2 insertions(+) + +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -2152,6 +2152,8 @@ void migrate_enable(void) + if (p->migration_disabled > 1) { + p->migration_disabled--; + return; ++ } else if (WARN_ON_ONCE(p->migration_disabled == 0)) { ++ return; + } + + /* diff --git a/debian/patches-rt/0001-z3fold-remove-preempt-disabled-sections-for-RT.patch b/debian/patches-rt/0001-z3fold-remove-preempt-disabled-sections-for-RT.patch deleted file mode 100644 index 1510395a7..000000000 --- a/debian/patches-rt/0001-z3fold-remove-preempt-disabled-sections-for-RT.patch +++ /dev/null @@ -1,86 +0,0 @@ -From 706a4dc9e97bd7cb1dd7873f858e0925950e0ea9 Mon Sep 17 00:00:00 2001 -From: Vitaly Wool <vitaly.wool@konsulko.com> -Date: Mon, 14 Dec 2020 19:12:36 -0800 -Subject: [PATCH 001/296] z3fold: remove preempt disabled sections for RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Replace get_cpu_ptr() with migrate_disable()+this_cpu_ptr() so RT can take -spinlocks that become sleeping locks. 
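The `get_cpu_ptr()` → `migrate_disable()` + `this_cpu_ptr()` substitution that this (now dropped) z3fold patch applied can be illustrated with a hypothetical user-space model: pinning only *migration* keeps `this_cpu_ptr()` returning the same per-CPU slot across a sleeping lock, while preemption stays legal. All names below are illustrative stand-ins, not kernel API.

```c
#include <assert.h>

#define NCPUS 4

/* Hypothetical model of the per-CPU access pattern. */
static int current_cpu;		/* which CPU the "task" runs on */
static int migrate_disabled;	/* nesting depth of migrate_disable() */
static int unbuddied[NCPUS];	/* stand-in for pool->unbuddied */

static void model_migrate_disable(void) { migrate_disabled++; }
static void model_migrate_enable(void)  { migrate_disabled--; }

static int *model_this_cpu_ptr(int *base)
{
	return &base[current_cpu];
}

/* The scheduler may move the task only while migration is enabled. */
static void model_maybe_migrate(int target)
{
	if (!migrate_disabled)
		current_cpu = target;
}

/* Scenario: with migration disabled, the slot stays stable even if the
 * scheduler tries to move the task (e.g. after a sleeping lock); once
 * re-enabled, the task may move and this_cpu_ptr() follows it. */
static int model_pattern_holds(void)
{
	int *slot;

	current_cpu = 1;
	model_migrate_disable();
	slot = model_this_cpu_ptr(unbuddied);
	model_maybe_migrate(3);		/* blocked: migration disabled */
	if (slot != model_this_cpu_ptr(unbuddied))
		return 0;
	model_migrate_enable();
	model_maybe_migrate(3);		/* now the task may move */
	return model_this_cpu_ptr(unbuddied) == &unbuddied[3];
}
```

Unlike `get_cpu_ptr()`, which disables preemption outright, this pattern is compatible with RT's sleeping spinlocks, which is why the patch made the swap.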
- -Signed-off-by Mike Galbraith <efault@gmx.de> - -Link: https://lkml.kernel.org/r/20201209145151.18994-3-vitaly.wool@konsulko.com -Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> -Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Andrew Morton <akpm@linux-foundation.org> -Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/z3fold.c | 17 ++++++++++------- - 1 file changed, 10 insertions(+), 7 deletions(-) - -diff --git a/mm/z3fold.c b/mm/z3fold.c -index 8ae944eeb8e2..36d810cac99d 100644 ---- a/mm/z3fold.c -+++ b/mm/z3fold.c -@@ -623,14 +623,16 @@ static inline void add_to_unbuddied(struct z3fold_pool *pool, - { - if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0 || - zhdr->middle_chunks == 0) { -- struct list_head *unbuddied = get_cpu_ptr(pool->unbuddied); -- -+ struct list_head *unbuddied; - int freechunks = num_free_chunks(zhdr); -+ -+ migrate_disable(); -+ unbuddied = this_cpu_ptr(pool->unbuddied); - spin_lock(&pool->lock); - list_add(&zhdr->buddy, &unbuddied[freechunks]); - spin_unlock(&pool->lock); - zhdr->cpu = smp_processor_id(); -- put_cpu_ptr(pool->unbuddied); -+ migrate_enable(); - } - } - -@@ -880,8 +882,9 @@ static inline struct z3fold_header *__z3fold_alloc(struct z3fold_pool *pool, - int chunks = size_to_chunks(size), i; - - lookup: -+ migrate_disable(); - /* First, try to find an unbuddied z3fold page. 
*/ -- unbuddied = get_cpu_ptr(pool->unbuddied); -+ unbuddied = this_cpu_ptr(pool->unbuddied); - for_each_unbuddied_list(i, chunks) { - struct list_head *l = &unbuddied[i]; - -@@ -899,7 +902,7 @@ static inline struct z3fold_header *__z3fold_alloc(struct z3fold_pool *pool, - !z3fold_page_trylock(zhdr)) { - spin_unlock(&pool->lock); - zhdr = NULL; -- put_cpu_ptr(pool->unbuddied); -+ migrate_enable(); - if (can_sleep) - cond_resched(); - goto lookup; -@@ -913,7 +916,7 @@ static inline struct z3fold_header *__z3fold_alloc(struct z3fold_pool *pool, - test_bit(PAGE_CLAIMED, &page->private)) { - z3fold_page_unlock(zhdr); - zhdr = NULL; -- put_cpu_ptr(pool->unbuddied); -+ migrate_enable(); - if (can_sleep) - cond_resched(); - goto lookup; -@@ -928,7 +931,7 @@ static inline struct z3fold_header *__z3fold_alloc(struct z3fold_pool *pool, - kref_get(&zhdr->refcount); - break; - } -- put_cpu_ptr(pool->unbuddied); -+ migrate_enable(); - - if (!zhdr) { - int cpu; --- -2.30.2 - diff --git a/debian/patches-rt/0001_documentation_kcov_include_types_h_in_the_example.patch b/debian/patches-rt/0001_documentation_kcov_include_types_h_in_the_example.patch new file mode 100644 index 000000000..3efce87f3 --- /dev/null +++ b/debian/patches-rt/0001_documentation_kcov_include_types_h_in_the_example.patch @@ -0,0 +1,38 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: Documentation/kcov: Include types.h in the example. +Date: Mon, 30 Aug 2021 19:26:23 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The first example code has includes at the top, the following two +example share that part. The last example (remote coverage collection) +requires the linux/types.h header file due its __aligned_u64 usage. + +Add the linux/types.h to the top most example and a comment that the +header files from above are required as it is done in the second +example. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210830172627.267989-2-bigeasy@linutronix.de +--- + Documentation/dev-tools/kcov.rst | 3 +++ + 1 file changed, 3 insertions(+) + +--- a/Documentation/dev-tools/kcov.rst ++++ b/Documentation/dev-tools/kcov.rst +@@ -50,6 +50,7 @@ The following program demonstrates cover + #include <sys/mman.h> + #include <unistd.h> + #include <fcntl.h> ++ #include <linux/types.h> + + #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) + #define KCOV_ENABLE _IO('c', 100) +@@ -251,6 +252,8 @@ selectively from different subsystems. + + .. code-block:: c + ++ /* Same includes and defines as above. */ ++ + struct kcov_remote_arg { + __u32 trace_mode; + __u32 area_size; diff --git a/debian/patches-rt/0001_sched_clean_up_the_might_sleep_underscore_zoo.patch b/debian/patches-rt/0001_sched_clean_up_the_might_sleep_underscore_zoo.patch new file mode 100644 index 000000000..8145127b6 --- /dev/null +++ b/debian/patches-rt/0001_sched_clean_up_the_might_sleep_underscore_zoo.patch @@ -0,0 +1,133 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Clean up the might_sleep() underscore zoo +Date: Thu, 23 Sep 2021 18:54:35 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +__might_sleep() vs. ___might_sleep() is hard to distinguish. Aside from that +the three underscore variant is exposed to provide a checkpoint for +rescheduling points which are distinct from blocking points. + +They are semantically a preemption point which means that scheduling is +state preserving. A real blocking operation, e.g. mutex_lock(), wait*(), +cannot preserve a task state which is not equal to RUNNING.
+ +While technically blocking on a "sleeping" spinlock in RT enabled kernels +falls into the voluntary scheduling category because it has to wait until +the contended spin/rw lock becomes available, the RT lock substitution code +can semantically be mapped to a voluntary preemption because the RT lock +substitution code and the scheduler are providing mechanisms to preserve +the task state and to take regular non-lock related wakeups into account. + +Rename ___might_sleep() to __might_resched() to make the distinction of +these functions clear. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210923165357.928693482@linutronix.de +--- + include/linux/kernel.h | 6 +++--- + include/linux/sched.h | 8 ++++---- + kernel/locking/spinlock_rt.c | 6 +++--- + kernel/sched/core.c | 6 +++--- + 4 files changed, 13 insertions(+), 13 deletions(-) + +--- a/include/linux/kernel.h ++++ b/include/linux/kernel.h +@@ -111,7 +111,7 @@ static __always_inline void might_resche + #endif /* CONFIG_PREEMPT_* */ + + #ifdef CONFIG_DEBUG_ATOMIC_SLEEP +-extern void ___might_sleep(const char *file, int line, int preempt_offset); ++extern void __might_resched(const char *file, int line, int preempt_offset); + extern void __might_sleep(const char *file, int line, int preempt_offset); + extern void __cant_sleep(const char *file, int line, int preempt_offset); + extern void __cant_migrate(const char *file, int line); +@@ -168,8 +168,8 @@ extern void __cant_migrate(const char *f + */ + # define non_block_end() WARN_ON(current->non_block_count-- == 0) + #else +- static inline void ___might_sleep(const char *file, int line, +- int preempt_offset) { } ++ static inline void __might_resched(const char *file, int line, ++ int preempt_offset) { } + static inline void __might_sleep(const char *file, int line, + int preempt_offset) { } + # define might_sleep() do { might_resched(); } while (0) +--- 
a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -2049,7 +2049,7 @@ static inline int _cond_resched(void) { + #endif /* !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC) */ + + #define cond_resched() ({ \ +- ___might_sleep(__FILE__, __LINE__, 0); \ ++ __might_resched(__FILE__, __LINE__, 0); \ + _cond_resched(); \ + }) + +@@ -2057,9 +2057,9 @@ extern int __cond_resched_lock(spinlock_ + extern int __cond_resched_rwlock_read(rwlock_t *lock); + extern int __cond_resched_rwlock_write(rwlock_t *lock); + +-#define cond_resched_lock(lock) ({ \ +- ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\ +- __cond_resched_lock(lock); \ ++#define cond_resched_lock(lock) ({ \ ++ __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ ++ __cond_resched_lock(lock); \ + }) + + #define cond_resched_rwlock_read(lock) ({ \ +--- a/kernel/locking/spinlock_rt.c ++++ b/kernel/locking/spinlock_rt.c +@@ -32,7 +32,7 @@ static __always_inline void rtlock_lock( + + static __always_inline void __rt_spin_lock(spinlock_t *lock) + { +- ___might_sleep(__FILE__, __LINE__, 0); ++ __might_resched(__FILE__, __LINE__, 0); + rtlock_lock(&lock->lock); + rcu_read_lock(); + migrate_disable(); +@@ -210,7 +210,7 @@ EXPORT_SYMBOL(rt_write_trylock); + + void __sched rt_read_lock(rwlock_t *rwlock) + { +- ___might_sleep(__FILE__, __LINE__, 0); ++ __might_resched(__FILE__, __LINE__, 0); + rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_); + rwbase_read_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT); + rcu_read_lock(); +@@ -220,7 +220,7 @@ EXPORT_SYMBOL(rt_read_lock); + + void __sched rt_write_lock(rwlock_t *rwlock) + { +- ___might_sleep(__FILE__, __LINE__, 0); ++ __might_resched(__FILE__, __LINE__, 0); + rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_); + rwbase_write_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT); + rcu_read_lock(); +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -9489,11 +9489,11 @@ void __might_sleep(const char *file, int + (void *)current->task_state_change, + (void 
*)current->task_state_change); + +- ___might_sleep(file, line, preempt_offset); ++ __might_resched(file, line, preempt_offset); + } + EXPORT_SYMBOL(__might_sleep); + +-void ___might_sleep(const char *file, int line, int preempt_offset) ++void __might_resched(const char *file, int line, int preempt_offset) + { + /* Ratelimiting timestamp: */ + static unsigned long prev_jiffy; +@@ -9538,7 +9538,7 @@ void ___might_sleep(const char *file, in + dump_stack(); + add_taint(TAINT_WARN, LOCKDEP_STILL_OK); + } +-EXPORT_SYMBOL(___might_sleep); ++EXPORT_SYMBOL(__might_resched); + + void __cant_sleep(const char *file, int line, int preempt_offset) + { diff --git a/debian/patches-rt/0001_sched_limit_the_number_of_task_migrations_per_batch_on_rt.patch b/debian/patches-rt/0001_sched_limit_the_number_of_task_migrations_per_batch_on_rt.patch new file mode 100644 index 000000000..69f2bb898 --- /dev/null +++ b/debian/patches-rt/0001_sched_limit_the_number_of_task_migrations_per_batch_on_rt.patch @@ -0,0 +1,32 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Limit the number of task migrations per batch on RT +Date: Tue, 28 Sep 2021 14:24:25 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Batched task migrations are a source for large latencies as they keep the +scheduler from running while processing the migrations. + +Limit the batch size to 8 instead of 32 when running on a RT enabled +kernel. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210928122411.425097596@linutronix.de +--- + kernel/sched/core.c | 4 ++++ + 1 file changed, 4 insertions(+) +--- +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -74,7 +74,11 @@ const_debug unsigned int sysctl_sched_fe + * Number of tasks to iterate in a single balance run. + * Limited because this is done with IRQs disabled. 
+ */ ++#ifdef CONFIG_PREEMPT_RT ++const_debug unsigned int sysctl_sched_nr_migrate = 8; ++#else + const_debug unsigned int sysctl_sched_nr_migrate = 32; ++#endif + + /* + * period over which we measure -rt task CPU usage in us. diff --git a/debian/patches-rt/0001_sched_rt_annotate_the_rt_balancing_logic_irqwork_as_irq_work_hard_irq.patch b/debian/patches-rt/0001_sched_rt_annotate_the_rt_balancing_logic_irqwork_as_irq_work_hard_irq.patch new file mode 100644 index 000000000..9c318e14b --- /dev/null +++ b/debian/patches-rt/0001_sched_rt_annotate_the_rt_balancing_logic_irqwork_as_irq_work_hard_irq.patch @@ -0,0 +1,38 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ +Date: Wed, 06 Oct 2021 13:18:49 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The push-IPI logic for RT tasks expects to be invoked from hardirq +context. One reason is that a RT task on the remote CPU would block the +softirq processing on PREEMPT_RT and so avoid pulling / balancing the RT +tasks as intended. + +Annotate root_domain::rto_push_work as IRQ_WORK_HARD_IRQ. 
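The one-line change above swaps `init_irq_work()` for `IRQ_WORK_INIT_HARD()`. As a hedged user-space sketch (all `MODEL_*`/`model_*` names are hypothetical, not the kernel API), the difference is just an initialization-time flag that marks the work item as must-run-in-hard-IRQ context, which PREEMPT_RT otherwise defers to a thread:

```c
#include <assert.h>

#define MODEL_IRQ_WORK_HARD 0x1u	/* stand-in for IRQ_WORK_HARD_IRQ */

struct model_irq_work {
	unsigned int flags;
	void (*func)(struct model_irq_work *);
};

/* Default initializer: no flags, so on an RT kernel the work could be
 * pushed off to the irq_work thread. */
static void model_init_irq_work(struct model_irq_work *w,
				void (*func)(struct model_irq_work *))
{
	w->flags = 0;
	w->func = func;
}

/* Hard-IRQ initializer: tags the item so it always runs from hardirq
 * context, as the patch wants for the RT push-IPI work. */
#define MODEL_IRQ_WORK_INIT_HARD(f) \
	(struct model_irq_work){ .flags = MODEL_IRQ_WORK_HARD, .func = (f) }

static void model_push_func(struct model_irq_work *w) { (void)w; }

static int model_hard_flag_set(void)
{
	struct model_irq_work soft, hard;

	model_init_irq_work(&soft, model_push_func);
	hard = MODEL_IRQ_WORK_INIT_HARD(model_push_func);
	return !(soft.flags & MODEL_IRQ_WORK_HARD) &&
	       (hard.flags & MODEL_IRQ_WORK_HARD);
}
```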
+ +Cc: Ingo Molnar <mingo@redhat.com> +Cc: Peter Zijlstra <peterz@infradead.org> +Cc: Juri Lelli <juri.lelli@redhat.com> +Cc: Vincent Guittot <vincent.guittot@linaro.org> +Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> +Cc: Steven Rostedt <rostedt@goodmis.org> +Cc: Ben Segall <bsegall@google.com> +Cc: Mel Gorman <mgorman@suse.de> +Cc: Daniel Bristot de Oliveira <bristot@redhat.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211006111852.1514359-2-bigeasy@linutronix.de +--- + kernel/sched/topology.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/kernel/sched/topology.c ++++ b/kernel/sched/topology.c +@@ -526,7 +526,7 @@ static int init_rootdomain(struct root_d + #ifdef HAVE_RT_PUSH_IPI + rd->rto_cpu = -1; + raw_spin_lock_init(&rd->rto_lock); +- init_irq_work(&rd->rto_push_work, rto_push_irq_work_func); ++ rd->rto_push_work = IRQ_WORK_INIT_HARD(rto_push_irq_work_func); + #endif + + rd->visit_gen = 0; diff --git a/debian/patches-rt/0002-drm-i915-Don-t-disable-interrupts-and-pretend-a-lock.patch b/debian/patches-rt/0002-drm-i915-Don-t-disable-interrupts-and-pretend-a-lock.patch new file mode 100644 index 000000000..5c01d5d70 --- /dev/null +++ b/debian/patches-rt/0002-drm-i915-Don-t-disable-interrupts-and-pretend-a-lock.patch @@ -0,0 +1,138 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue, 7 Jul 2020 12:25:11 +0200 +Subject: [PATCH 02/10] drm/i915: Don't disable interrupts and pretend a lock + as been acquired in __timeline_mark_lock(). +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +This is a revert of commits + d67739268cf0e ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") + 6c69a45445af9 ("drm/i915/gt: Mark context->active_count as protected by timeline->mutex") + +The existing code leads to a different behaviour depending on whether +lockdep is enabled or not. 
Any following lock that is acquired without +disabling interrupts (but needs to) will not be noticed by lockdep. + +This it not just a lockdep annotation but is used but an actual mutex_t +that is properly used as a lock but in case of __timeline_mark_lock() +lockdep is only told that it is acquired but no lock has been acquired. + +It appears that its purpose is just satisfy the lockdep_assert_held() +check in intel_context_mark_active(). The other problem with disabling +interrupts is that on PREEMPT_RT interrupts are also disabled which +leads to problems for instance later during memory allocation. + +Add a CONTEXT_IS_PARKED bit to intel_engine_cs and set_bit/clear_bit it +instead of mutex_acquire/mutex_release. Use test_bit in the two +identified spots which relied on the lockdep annotation. + +Cc: Peter Zijlstra <peterz@infradead.org> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + drivers/gpu/drm/i915/gt/intel_context.h | 3 +- + drivers/gpu/drm/i915/gt/intel_context_types.h | 1 + drivers/gpu/drm/i915/gt/intel_engine_pm.c | 38 +------------------------- + drivers/gpu/drm/i915/i915_request.h | 3 +- + 4 files changed, 7 insertions(+), 38 deletions(-) + +--- a/drivers/gpu/drm/i915/gt/intel_context.h ++++ b/drivers/gpu/drm/i915/gt/intel_context.h +@@ -163,7 +163,8 @@ static inline void intel_context_enter(s + + static inline void intel_context_mark_active(struct intel_context *ce) + { +- lockdep_assert_held(&ce->timeline->mutex); ++ lockdep_assert(lockdep_is_held(&ce->timeline->mutex) || ++ test_bit(CONTEXT_IS_PARKED, &ce->flags)); + ++ce->active_count; + } + +--- a/drivers/gpu/drm/i915/gt/intel_context_types.h ++++ b/drivers/gpu/drm/i915/gt/intel_context_types.h +@@ -112,6 +112,7 @@ struct intel_context { + #define CONTEXT_FORCE_SINGLE_SUBMISSION 7 + #define CONTEXT_NOPREEMPT 8 + #define CONTEXT_LRCA_DIRTY 9 ++#define CONTEXT_IS_PARKED 10 + + struct { + u64 timeout_us; +--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c ++++ 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c +@@ -80,39 +80,6 @@ static int __engine_unpark(struct intel_ + return 0; + } + +-#if IS_ENABLED(CONFIG_LOCKDEP) +- +-static unsigned long __timeline_mark_lock(struct intel_context *ce) +-{ +- unsigned long flags; +- +- local_irq_save(flags); +- mutex_acquire(&ce->timeline->mutex.dep_map, 2, 0, _THIS_IP_); +- +- return flags; +-} +- +-static void __timeline_mark_unlock(struct intel_context *ce, +- unsigned long flags) +-{ +- mutex_release(&ce->timeline->mutex.dep_map, _THIS_IP_); +- local_irq_restore(flags); +-} +- +-#else +- +-static unsigned long __timeline_mark_lock(struct intel_context *ce) +-{ +- return 0; +-} +- +-static void __timeline_mark_unlock(struct intel_context *ce, +- unsigned long flags) +-{ +-} +- +-#endif /* !IS_ENABLED(CONFIG_LOCKDEP) */ +- + static void duration(struct dma_fence *fence, struct dma_fence_cb *cb) + { + struct i915_request *rq = to_request(fence); +@@ -159,7 +126,6 @@ static bool switch_to_kernel_context(str + { + struct intel_context *ce = engine->kernel_context; + struct i915_request *rq; +- unsigned long flags; + bool result = true; + + /* GPU is pointing to the void, as good as in the kernel context. */ +@@ -201,7 +167,7 @@ static bool switch_to_kernel_context(str + * engine->wakeref.count, we may see the request completion and retire + * it causing an underflow of the engine->wakeref. 
+ */ +- flags = __timeline_mark_lock(ce); ++ set_bit(CONTEXT_IS_PARKED, &ce->flags); + GEM_BUG_ON(atomic_read(&ce->timeline->active_count) < 0); + + rq = __i915_request_create(ce, GFP_NOWAIT); +@@ -233,7 +199,7 @@ static bool switch_to_kernel_context(str + + result = false; + out_unlock: +- __timeline_mark_unlock(ce, flags); ++ clear_bit(CONTEXT_IS_PARKED, &ce->flags); + return result; + } + +--- a/drivers/gpu/drm/i915/i915_request.h ++++ b/drivers/gpu/drm/i915/i915_request.h +@@ -609,7 +609,8 @@ i915_request_timeline(const struct i915_ + { + /* Valid only while the request is being constructed (or retired). */ + return rcu_dereference_protected(rq->timeline, +- lockdep_is_held(&rcu_access_pointer(rq->timeline)->mutex)); ++ lockdep_is_held(&rcu_access_pointer(rq->timeline)->mutex) || ++ test_bit(CONTEXT_IS_PARKED, &rq->context->flags)); + } + + static inline struct i915_gem_context * diff --git a/debian/patches-rt/0002-gen_stats-Add-gnet_stats_add_queue.patch b/debian/patches-rt/0002-gen_stats-Add-gnet_stats_add_queue.patch new file mode 100644 index 000000000..f49b856a0 --- /dev/null +++ b/debian/patches-rt/0002-gen_stats-Add-gnet_stats_add_queue.patch @@ -0,0 +1,69 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:03 +0200 +Subject: [PATCH 2/9] gen_stats: Add gnet_stats_add_queue(). +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +This function will replace __gnet_stats_copy_queue(). It reads all +arguments and adds them into the passed gnet_stats_queue argument. +In contrast to __gnet_stats_copy_queue() it also copies the qlen member. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. 
Miller <davem@davemloft.net> +--- + include/net/gen_stats.h | 3 +++ + net/core/gen_stats.c | 32 ++++++++++++++++++++++++++++++++ + 2 files changed, 35 insertions(+) + +--- a/include/net/gen_stats.h ++++ b/include/net/gen_stats.h +@@ -62,6 +62,9 @@ int gnet_stats_copy_queue(struct gnet_du + void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats, + const struct gnet_stats_queue __percpu *cpu_q, + const struct gnet_stats_queue *q, __u32 qlen); ++void gnet_stats_add_queue(struct gnet_stats_queue *qstats, ++ const struct gnet_stats_queue __percpu *cpu_q, ++ const struct gnet_stats_queue *q); + int gnet_stats_copy_app(struct gnet_dump *d, void *st, int len); + + int gnet_stats_finish_copy(struct gnet_dump *d); +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -321,6 +321,38 @@ void __gnet_stats_copy_queue(struct gnet + } + EXPORT_SYMBOL(__gnet_stats_copy_queue); + ++static void gnet_stats_add_queue_cpu(struct gnet_stats_queue *qstats, ++ const struct gnet_stats_queue __percpu *q) ++{ ++ int i; ++ ++ for_each_possible_cpu(i) { ++ const struct gnet_stats_queue *qcpu = per_cpu_ptr(q, i); ++ ++ qstats->qlen += qcpu->backlog; ++ qstats->backlog += qcpu->backlog; ++ qstats->drops += qcpu->drops; ++ qstats->requeues += qcpu->requeues; ++ qstats->overlimits += qcpu->overlimits; ++ } ++} ++ ++void gnet_stats_add_queue(struct gnet_stats_queue *qstats, ++ const struct gnet_stats_queue __percpu *cpu, ++ const struct gnet_stats_queue *q) ++{ ++ if (cpu) { ++ gnet_stats_add_queue_cpu(qstats, cpu); ++ } else { ++ qstats->qlen += q->qlen; ++ qstats->backlog += q->backlog; ++ qstats->drops += q->drops; ++ qstats->requeues += q->requeues; ++ qstats->overlimits += q->overlimits; ++ } ++} ++EXPORT_SYMBOL(gnet_stats_add_queue); ++ + /** + * gnet_stats_copy_queue - copy queue statistics into statistics TLV + * @d: dumping handle diff --git a/debian/patches-rt/0002-stop_machine-Add-function-and-caller-debug-info.patch 
b/debian/patches-rt/0002-stop_machine-Add-function-and-caller-debug-info.patch deleted file mode 100644 index f75d66585..000000000 --- a/debian/patches-rt/0002-stop_machine-Add-function-and-caller-debug-info.patch +++ /dev/null @@ -1,135 +0,0 @@ -From 2c62fc85d24c3b361dae4b3cbc251e9cbe24d6b2 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:11:59 +0200 -Subject: [PATCH 002/296] stop_machine: Add function and caller debug info -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Crashes in stop-machine are hard to connect to the calling code, add a -little something to help with that. - -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/stop_machine.h | 5 +++++ - kernel/stop_machine.c | 23 ++++++++++++++++++++--- - lib/dump_stack.c | 2 ++ - 3 files changed, 27 insertions(+), 3 deletions(-) - ---- a/include/linux/stop_machine.h -+++ b/include/linux/stop_machine.h -@@ -24,6 +24,7 @@ - struct cpu_stop_work { - struct list_head list; /* cpu_stopper->works */ - cpu_stop_fn_t fn; -+ unsigned long caller; - void *arg; - struct cpu_stop_done *done; - }; -@@ -36,6 +37,8 @@ - void stop_machine_unpark(int cpu); - void stop_machine_yield(const struct cpumask *cpumask); - -+extern void print_stop_info(const char *log_lvl, struct task_struct *task); -+ - #else /* CONFIG_SMP */ - - #include <linux/workqueue.h> -@@ -80,6 +83,8 @@ - return false; - } - -+static inline void print_stop_info(const char *log_lvl, struct task_struct *task) { } -+ - #endif /* CONFIG_SMP */ - - /* ---- a/kernel/stop_machine.c -+++ b/kernel/stop_machine.c -@@ -42,11 +42,23 @@ - struct list_head works; /* list of pending works */ - - struct cpu_stop_work stop_work; /* for stop_cpus */ -+ unsigned long caller; -+ cpu_stop_fn_t fn; - }; - - static DEFINE_PER_CPU(struct cpu_stopper, cpu_stopper); - static bool 
stop_machine_initialized = false; - -+void print_stop_info(const char *log_lvl, struct task_struct *task) -+{ -+ struct cpu_stopper *stopper = this_cpu_ptr(&cpu_stopper); -+ -+ if (task != stopper->thread) -+ return; -+ -+ printk("%sStopper: %pS <- %pS\n", log_lvl, stopper->fn, (void *)stopper->caller); -+} -+ - /* static data for stop_cpus */ - static DEFINE_MUTEX(stop_cpus_mutex); - static bool stop_cpus_in_progress; -@@ -123,7 +135,7 @@ - int stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg) - { - struct cpu_stop_done done; -- struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = &done }; -+ struct cpu_stop_work work = { .fn = fn, .arg = arg, .done = &done, .caller = _RET_IP_ }; - - cpu_stop_init_done(&done, 1); - if (!cpu_stop_queue_work(cpu, &work)) -@@ -331,7 +343,8 @@ - work1 = work2 = (struct cpu_stop_work){ - .fn = multi_cpu_stop, - .arg = &msdata, -- .done = &done -+ .done = &done, -+ .caller = _RET_IP_, - }; - - cpu_stop_init_done(&done, 2); -@@ -367,7 +380,7 @@ - bool stop_one_cpu_nowait(unsigned int cpu, cpu_stop_fn_t fn, void *arg, - struct cpu_stop_work *work_buf) - { -- *work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg, }; -+ *work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg, .caller = _RET_IP_, }; - return cpu_stop_queue_work(cpu, work_buf); - } - -@@ -487,6 +500,8 @@ - int ret; - - /* cpu stop callbacks must not sleep, make in_atomic() == T */ -+ stopper->caller = work->caller; -+ stopper->fn = fn; - preempt_count_inc(); - ret = fn(arg); - if (done) { -@@ -495,6 +510,8 @@ - cpu_stop_signal_done(done); - } - preempt_count_dec(); -+ stopper->fn = NULL; -+ stopper->caller = 0; - WARN_ONCE(preempt_count(), - "cpu_stop: %ps(%p) leaked preempt count\n", fn, arg); - goto repeat; ---- a/lib/dump_stack.c -+++ b/lib/dump_stack.c -@@ -12,6 +12,7 @@ - #include <linux/atomic.h> - #include <linux/kexec.h> - #include <linux/utsname.h> -+#include <linux/stop_machine.h> - #include <generated/package.h> - - static char 
dump_stack_arch_desc_str[128]; -@@ -59,6 +60,7 @@ - log_lvl, dump_stack_arch_desc_str); - - print_worker_info(log_lvl, current); -+ print_stop_info(log_lvl, current); - } - - /** diff --git a/debian/patches-rt/0002_documentation_kcov_define_ip_in_the_example.patch b/debian/patches-rt/0002_documentation_kcov_define_ip_in_the_example.patch new file mode 100644 index 000000000..0731c43a2 --- /dev/null +++ b/debian/patches-rt/0002_documentation_kcov_define_ip_in_the_example.patch @@ -0,0 +1,27 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: Documentation/kcov: Define `ip' in the example. +Date: Mon, 30 Aug 2021 19:26:24 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The example code uses the variable `ip' but never declares it. + +Declare `ip' as a 64bit variable which is the same type as the array +from which it loads its value. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210830172627.267989-3-bigeasy@linutronix.de +--- + Documentation/dev-tools/kcov.rst | 2 ++ + 1 file changed, 2 insertions(+) + +--- a/Documentation/dev-tools/kcov.rst ++++ b/Documentation/dev-tools/kcov.rst +@@ -178,6 +178,8 @@ Comparison operands collection + /* Read number of comparisons collected. */ + n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED); + for (i = 0; i < n; i++) { ++ uint64_t ip; ++ + type = cover[i * KCOV_WORDS_PER_CMP + 1]; + /* arg1 and arg2 - operands of the comparison. 
*/ + arg1 = cover[i * KCOV_WORDS_PER_CMP + 2]; diff --git a/debian/patches-rt/0002_irq_work_allow_irq_work_sync_to_sleep_if_irq_work_no_irq_support.patch b/debian/patches-rt/0002_irq_work_allow_irq_work_sync_to_sleep_if_irq_work_no_irq_support.patch new file mode 100644 index 000000000..c273e441b --- /dev/null +++ b/debian/patches-rt/0002_irq_work_allow_irq_work_sync_to_sleep_if_irq_work_no_irq_support.patch @@ -0,0 +1,74 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support. +Date: Wed, 06 Oct 2021 13:18:50 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +irq_work() instantly triggers an interrupt if supported by the +architecture. Otherwise the work will be processed on the next timer +tick. In the worst case irq_work_sync() could spin up to a jiffy. + +irq_work_sync() is usually used in tear-down context which is fully +preemptible. Based on review, irq_work_sync() is invoked from preemptible +context and there is one waiter at a time. This qualifies it to use +rcuwait for synchronisation. + +Let irq_work_sync() synchronize with rcuwait if the architecture +processes irqwork via the timer tick. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211006111852.1514359-3-bigeasy@linutronix.de +--- + include/linux/irq_work.h | 3 +++ + kernel/irq_work.c | 10 ++++++++++ + 2 files changed, 13 insertions(+) + +--- a/include/linux/irq_work.h ++++ b/include/linux/irq_work.h +@@ -3,6 +3,7 @@ + #define _LINUX_IRQ_WORK_H + + #include <linux/smp_types.h> ++#include <linux/rcuwait.h> + + /* + * An entry can be in one of four states: +@@ -16,11 +17,13 @@ + struct irq_work { + struct __call_single_node node; + void (*func)(struct irq_work *); ++ struct rcuwait irqwait; + }; + + #define __IRQ_WORK_INIT(_func, _flags) (struct irq_work){ \ + .node = { .u_flags = (_flags), }, \ + .func = (_func), \ ++ .irqwait = __RCUWAIT_INITIALIZER(irqwait), \ + } + + #define IRQ_WORK_INIT(_func) __IRQ_WORK_INIT(_func, 0) +--- a/kernel/irq_work.c ++++ b/kernel/irq_work.c +@@ -160,6 +160,9 @@ void irq_work_single(void *arg) + * else claimed it meanwhile. + */ + (void)atomic_cmpxchg(&work->node.a_flags, flags, flags & ~IRQ_WORK_BUSY); ++ ++ if (!arch_irq_work_has_interrupt()) ++ rcuwait_wake_up(&work->irqwait); + } + + static void irq_work_run_list(struct llist_head *list) +@@ -204,6 +207,13 @@ void irq_work_tick(void) + void irq_work_sync(struct irq_work *work) + { + lockdep_assert_irqs_enabled(); ++ might_sleep(); ++ ++ if (!arch_irq_work_has_interrupt()) { ++ rcuwait_wait_event(&work->irqwait, !irq_work_is_busy(work), ++ TASK_UNINTERRUPTIBLE); ++ return; ++ } + + while (irq_work_is_busy(work)) + cpu_relax(); diff --git a/debian/patches-rt/0002_sched_disable_ttwu_queue_on_rt.patch b/debian/patches-rt/0002_sched_disable_ttwu_queue_on_rt.patch new file mode 100644 index 000000000..32e8437a5 --- /dev/null +++ b/debian/patches-rt/0002_sched_disable_ttwu_queue_on_rt.patch @@ -0,0 +1,40 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Disable TTWU_QUEUE on RT +Date: Tue, 28 Sep 2021 14:24:27 +0200 +Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The queued remote wakeup mechanism has turned out to be suboptimal for RT +enabled kernels. The maximum latencies go up by a factor of > 5x in certain +scenarios. + +This is caused by either long wake lists or by a large number of TTWU IPIs +which are processed back to back. + +Disable it for RT. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210928122411.482262764@linutronix.de +--- + kernel/sched/features.h | 5 +++++ + 1 file changed, 5 insertions(+) +--- +--- a/kernel/sched/features.h ++++ b/kernel/sched/features.h +@@ -46,11 +46,16 @@ SCHED_FEAT(DOUBLE_TICK, false) + */ + SCHED_FEAT(NONTASK_CAPACITY, true) + ++#ifdef CONFIG_PREEMPT_RT ++SCHED_FEAT(TTWU_QUEUE, false) ++#else ++ + /* + * Queue remote wakeups on the target CPU and process them + * using the scheduler IPI. Reduces rq->lock contention/bounces. + */ + SCHED_FEAT(TTWU_QUEUE, true) ++#endif + + /* + * When doing wakeups, attempt to limit superfluous scans of the LLC domain. diff --git a/debian/patches-rt/0002_sched_make_cond_resched__lock_variants_consistent_vs_might_sleep.patch b/debian/patches-rt/0002_sched_make_cond_resched__lock_variants_consistent_vs_might_sleep.patch new file mode 100644 index 000000000..9465bebcd --- /dev/null +++ b/debian/patches-rt/0002_sched_make_cond_resched__lock_variants_consistent_vs_might_sleep.patch @@ -0,0 +1,47 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Make cond_resched_*lock() variants consistent vs. 
might_sleep() +Date: Thu, 23 Sep 2021 18:54:37 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Commit 3427445afd26 ("sched: Exclude cond_resched() from nested sleep +test") removed the task state check of __might_sleep() for +cond_resched_lock() because cond_resched_lock() is not a voluntary +scheduling point which blocks. It's a preemption point which requires the +lock holder to release the spin lock. + +The same rationale applies to cond_resched_rwlock_read/write(), but those +were not touched. + +Make it consistent and use the non-state checking __might_resched() there +as well. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210923165357.991262778@linutronix.de +--- + include/linux/sched.h | 12 ++++++------ + 1 file changed, 6 insertions(+), 6 deletions(-) + +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -2062,14 +2062,14 @@ extern int __cond_resched_rwlock_write(r + __cond_resched_lock(lock); \ + }) + +-#define cond_resched_rwlock_read(lock) ({ \ +- __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ +- __cond_resched_rwlock_read(lock); \ ++#define cond_resched_rwlock_read(lock) ({ \ ++ __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ ++ __cond_resched_rwlock_read(lock); \ + }) + +-#define cond_resched_rwlock_write(lock) ({ \ +- __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ +- __cond_resched_rwlock_write(lock); \ ++#define cond_resched_rwlock_write(lock) ({ \ ++ __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ ++ __cond_resched_rwlock_write(lock); \ + }) + + static inline void cond_resched_rcu(void) diff --git a/debian/patches-rt/0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch b/debian/patches-rt/0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch new file mode 100644 index 000000000..f864edec6 --- /dev/null +++ 
b/debian/patches-rt/0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch @@ -0,0 +1,56 @@ +From: Mike Galbraith <umgwanakikbuti@gmail.com> +Date: Sat, 27 Feb 2016 08:09:11 +0100 +Subject: [PATCH 03/10] drm/i915: Use preempt_disable/enable_rt() where + recommended +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Mario Kleiner suggested in commit + ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into kms driver.") + +spots where preemption should be disabled on PREEMPT_RT. The +difference is that on PREEMPT_RT the intel_uncore::lock disables neither +preemption nor interrupts and so the region remains preemptible. + +The area covers only register reads and writes. The part that worries me +is: +- __intel_get_crtc_scanline() the worst case is 100us if no match is + found. + +- intel_crtc_scanlines_since_frame_timestamp() not sure how long this + may take in the worst case. + +It was in the RT queue for a while and nobody complained. +Disable preemption on PREEMPT_RT during timestamping. + +[bigeasy: patch description.] + +Cc: Mario Kleiner <mario.kleiner.de@gmail.com> +Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + drivers/gpu/drm/i915/i915_irq.c | 6 ++++-- + 1 file changed, 4 insertions(+), 2 deletions(-) + +--- a/drivers/gpu/drm/i915/i915_irq.c ++++ b/drivers/gpu/drm/i915/i915_irq.c +@@ -886,7 +886,8 @@ static bool i915_get_crtc_scanoutpos(str + */ + spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); + +- /* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */ ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_disable(); + + /* Get optional system timestamp before query. 
*/ + if (stime) +@@ -950,7 +951,8 @@ static bool i915_get_crtc_scanoutpos(str + if (etime) + *etime = ktime_get(); + +- /* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */ ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_enable(); + + spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); + diff --git a/debian/patches-rt/0003-mq-mqprio-Use-gnet_stats_add_queue.patch b/debian/patches-rt/0003-mq-mqprio-Use-gnet_stats_add_queue.patch new file mode 100644 index 000000000..87512956f --- /dev/null +++ b/debian/patches-rt/0003-mq-mqprio-Use-gnet_stats_add_queue.patch @@ -0,0 +1,139 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:04 +0200 +Subject: [PATCH 3/9] mq, mqprio: Use gnet_stats_add_queue(). +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +gnet_stats_add_basic() and gnet_stats_add_queue() add up the statistics +so they can be used directly for both the per-CPU and global case. + +gnet_stats_add_queue() copies either Qdisc's per-CPU +gnet_stats_queue::qlen or the global member. The global +gnet_stats_queue::qlen isn't touched in the per-CPU case so there is no +need to consider it in the global-case. + +In the per-CPU case, the sum of global gnet_stats_queue::qlen and +the per-CPU gnet_stats_queue::qlen was assigned to sch->q.qlen and +sch->qstats.qlen. Now both fields are copied individually. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. 
Miller <davem@davemloft.net> +--- + net/sched/sch_mq.c | 24 +++++------------------- + net/sched/sch_mqprio.c | 49 ++++++++++++------------------------------------- + 2 files changed, 17 insertions(+), 56 deletions(-) + +--- a/net/sched/sch_mq.c ++++ b/net/sched/sch_mq.c +@@ -130,7 +130,6 @@ static int mq_dump(struct Qdisc *sch, st + struct net_device *dev = qdisc_dev(sch); + struct Qdisc *qdisc; + unsigned int ntx; +- __u32 qlen = 0; + + sch->q.qlen = 0; + memset(&sch->bstats, 0, sizeof(sch->bstats)); +@@ -145,24 +144,11 @@ static int mq_dump(struct Qdisc *sch, st + qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping; + spin_lock_bh(qdisc_lock(qdisc)); + +- if (qdisc_is_percpu_stats(qdisc)) { +- qlen = qdisc_qlen_sum(qdisc); +- gnet_stats_add_basic(NULL, &sch->bstats, +- qdisc->cpu_bstats, &qdisc->bstats); +- __gnet_stats_copy_queue(&sch->qstats, +- qdisc->cpu_qstats, +- &qdisc->qstats, qlen); +- sch->q.qlen += qlen; +- } else { +- sch->q.qlen += qdisc->q.qlen; +- sch->bstats.bytes += qdisc->bstats.bytes; +- sch->bstats.packets += qdisc->bstats.packets; +- sch->qstats.qlen += qdisc->qstats.qlen; +- sch->qstats.backlog += qdisc->qstats.backlog; +- sch->qstats.drops += qdisc->qstats.drops; +- sch->qstats.requeues += qdisc->qstats.requeues; +- sch->qstats.overlimits += qdisc->qstats.overlimits; +- } ++ gnet_stats_add_basic(NULL, &sch->bstats, qdisc->cpu_bstats, ++ &qdisc->bstats); ++ gnet_stats_add_queue(&sch->qstats, qdisc->cpu_qstats, ++ &qdisc->qstats); ++ sch->q.qlen += qdisc_qlen(qdisc); + + spin_unlock_bh(qdisc_lock(qdisc)); + } +--- a/net/sched/sch_mqprio.c ++++ b/net/sched/sch_mqprio.c +@@ -402,24 +402,11 @@ static int mqprio_dump(struct Qdisc *sch + qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping; + spin_lock_bh(qdisc_lock(qdisc)); + +- if (qdisc_is_percpu_stats(qdisc)) { +- __u32 qlen = qdisc_qlen_sum(qdisc); +- +- gnet_stats_add_basic(NULL, &sch->bstats, +- qdisc->cpu_bstats, &qdisc->bstats); +- __gnet_stats_copy_queue(&sch->qstats, +- 
qdisc->cpu_qstats, +- &qdisc->qstats, qlen); +- sch->q.qlen += qlen; +- } else { +- sch->q.qlen += qdisc->q.qlen; +- sch->bstats.bytes += qdisc->bstats.bytes; +- sch->bstats.packets += qdisc->bstats.packets; +- sch->qstats.backlog += qdisc->qstats.backlog; +- sch->qstats.drops += qdisc->qstats.drops; +- sch->qstats.requeues += qdisc->qstats.requeues; +- sch->qstats.overlimits += qdisc->qstats.overlimits; +- } ++ gnet_stats_add_basic(NULL, &sch->bstats, qdisc->cpu_bstats, ++ &qdisc->bstats); ++ gnet_stats_add_queue(&sch->qstats, qdisc->cpu_qstats, ++ &qdisc->qstats); ++ sch->q.qlen += qdisc_qlen(qdisc); + + spin_unlock_bh(qdisc_lock(qdisc)); + } +@@ -511,7 +498,7 @@ static int mqprio_dump_class_stats(struc + { + if (cl >= TC_H_MIN_PRIORITY) { + int i; +- __u32 qlen = 0; ++ __u32 qlen; + struct gnet_stats_queue qstats = {0}; + struct gnet_stats_basic_packed bstats = {0}; + struct net_device *dev = qdisc_dev(sch); +@@ -531,27 +518,15 @@ static int mqprio_dump_class_stats(struc + + spin_lock_bh(qdisc_lock(qdisc)); + +- if (qdisc_is_percpu_stats(qdisc)) { +- qlen = qdisc_qlen_sum(qdisc); ++ gnet_stats_add_basic(NULL, &bstats, qdisc->cpu_bstats, ++ &qdisc->bstats); ++ gnet_stats_add_queue(&qstats, qdisc->cpu_qstats, ++ &qdisc->qstats); ++ sch->q.qlen += qdisc_qlen(qdisc); + +- gnet_stats_add_basic(NULL, &bstats, +- qdisc->cpu_bstats, +- &qdisc->bstats); +- __gnet_stats_copy_queue(&qstats, +- qdisc->cpu_qstats, +- &qdisc->qstats, +- qlen); +- } else { +- qlen += qdisc->q.qlen; +- bstats.bytes += qdisc->bstats.bytes; +- bstats.packets += qdisc->bstats.packets; +- qstats.backlog += qdisc->qstats.backlog; +- qstats.drops += qdisc->qstats.drops; +- qstats.requeues += qdisc->qstats.requeues; +- qstats.overlimits += qdisc->qstats.overlimits; +- } + spin_unlock_bh(qdisc_lock(qdisc)); + } ++ qlen = qdisc_qlen(sch) + qstats.qlen; + + /* Reclaim root sleeping lock before completing stats */ + if (d->lock) diff --git 
a/debian/patches-rt/0003-rtmutex-Add-a-special-case-for-ww-mutex-handling.patch b/debian/patches-rt/0003-rtmutex-Add-a-special-case-for-ww-mutex-handling.patch new file mode 100644 index 000000000..017012785 --- /dev/null +++ b/debian/patches-rt/0003-rtmutex-Add-a-special-case-for-ww-mutex-handling.patch @@ -0,0 +1,48 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri, 13 Aug 2021 12:40:49 +0200 +Subject: [PATCH 03/10] rtmutex: Add a special case for ww-mutex handling. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The lockdep selftest for ww-mutex assumes in a few cases the +ww_ctx->contending_lock assignment via __ww_mutex_check_kill() which +does not happen if the rtmutex detects the deadlock early. + +The testcase passes if the deadlock handling here is removed. This means +that it will work if multiple threads/tasks are involved and not just a +single one. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + kernel/locking/rtmutex.c | 20 +++++++++++++++++++- + 1 file changed, 19 insertions(+), 1 deletion(-) + +--- a/kernel/locking/rtmutex.c ++++ b/kernel/locking/rtmutex.c +@@ -1097,8 +1097,26 @@ static int __sched task_blocks_on_rt_mut + * which is wrong, as the other waiter is not in a deadlock + * situation. + */ +- if (owner == task) ++ if (owner == task) { ++#if defined(DEBUG_WW_MUTEXES) && defined(CONFIG_DEBUG_LOCKING_API_SELFTESTS) ++ /* ++ * The lockdep selftest for ww-mutex assumes in a few cases ++ * the ww_ctx->contending_lock assignment via ++ * __ww_mutex_check_kill() which does not happen if the rtmutex ++ * detects the deadlock early. 
++ */ ++ if (build_ww_mutex() && ww_ctx) { ++ struct rt_mutex *rtm; ++ ++ /* Check whether the waiter should backout immediately */ ++ rtm = container_of(lock, struct rt_mutex, rtmutex); ++ ++ __ww_mutex_add_waiter(waiter, rtm, ww_ctx); ++ __ww_mutex_check_kill(rtm, waiter, ww_ctx); ++ } ++#endif + return -EDEADLK; ++ } + + raw_spin_lock(&task->pi_lock); + waiter->task = task; diff --git a/debian/patches-rt/0003-sched-Fix-balance_callback.patch b/debian/patches-rt/0003-sched-Fix-balance_callback.patch deleted file mode 100644 index 3448f4fc1..000000000 --- a/debian/patches-rt/0003-sched-Fix-balance_callback.patch +++ /dev/null @@ -1,235 +0,0 @@ -From 0d544cc765b0ebf25038748d52d2e76f256a1e56 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:00 +0200 -Subject: [PATCH 003/296] sched: Fix balance_callback() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The intent of balance_callback() has always been to delay executing -balancing operations until the end of the current rq->lock section. -This is because balance operations must often drop rq->lock, and that -isn't safe in general. - -However, as noted by Scott, there were a few holes in that scheme; -balance_callback() was called after rq->lock was dropped, which means -another CPU can interleave and touch the callback list. - -Rework code to call the balance callbacks before dropping rq->lock -where possible, and otherwise splice the balance list onto a local -stack. - -This guarantees that the balance list must be empty when we take -rq->lock. IOW, we'll only ever run our own balance callbacks. 
- -Reported-by: Scott Wood <swood@redhat.com> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 119 +++++++++++++++++++++++++++---------------- - kernel/sched/sched.h | 3 ++ - 2 files changed, 78 insertions(+), 44 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 3a150445e0cb..56f9a850c9bf 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -3487,6 +3487,69 @@ static inline void finish_task(struct task_struct *prev) - #endif - } - -+#ifdef CONFIG_SMP -+ -+static void do_balance_callbacks(struct rq *rq, struct callback_head *head) -+{ -+ void (*func)(struct rq *rq); -+ struct callback_head *next; -+ -+ lockdep_assert_held(&rq->lock); -+ -+ while (head) { -+ func = (void (*)(struct rq *))head->func; -+ next = head->next; -+ head->next = NULL; -+ head = next; -+ -+ func(rq); -+ } -+} -+ -+static inline struct callback_head *splice_balance_callbacks(struct rq *rq) -+{ -+ struct callback_head *head = rq->balance_callback; -+ -+ lockdep_assert_held(&rq->lock); -+ if (head) -+ rq->balance_callback = NULL; -+ -+ return head; -+} -+ -+static void __balance_callbacks(struct rq *rq) -+{ -+ do_balance_callbacks(rq, splice_balance_callbacks(rq)); -+} -+ -+static inline void balance_callbacks(struct rq *rq, struct callback_head *head) -+{ -+ unsigned long flags; -+ -+ if (unlikely(head)) { -+ raw_spin_lock_irqsave(&rq->lock, flags); -+ do_balance_callbacks(rq, head); -+ raw_spin_unlock_irqrestore(&rq->lock, flags); -+ } -+} -+ -+#else -+ -+static inline void __balance_callbacks(struct rq *rq) -+{ -+} -+ -+static inline struct callback_head *splice_balance_callbacks(struct rq *rq) -+{ -+ return NULL; -+} -+ -+static inline void balance_callbacks(struct rq *rq, struct callback_head *head) -+{ -+} -+ -+#endif -+ - static inline void - prepare_lock_switch(struct rq *rq, struct task_struct *next, struct rq_flags *rf) - { -@@ -3512,6 
+3575,7 @@ static inline void finish_lock_switch(struct rq *rq) - * prev into current: - */ - spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_); -+ __balance_callbacks(rq); - raw_spin_unlock_irq(&rq->lock); - } - -@@ -3653,43 +3717,6 @@ static struct rq *finish_task_switch(struct task_struct *prev) - return rq; - } - --#ifdef CONFIG_SMP -- --/* rq->lock is NOT held, but preemption is disabled */ --static void __balance_callback(struct rq *rq) --{ -- struct callback_head *head, *next; -- void (*func)(struct rq *rq); -- unsigned long flags; -- -- raw_spin_lock_irqsave(&rq->lock, flags); -- head = rq->balance_callback; -- rq->balance_callback = NULL; -- while (head) { -- func = (void (*)(struct rq *))head->func; -- next = head->next; -- head->next = NULL; -- head = next; -- -- func(rq); -- } -- raw_spin_unlock_irqrestore(&rq->lock, flags); --} -- --static inline void balance_callback(struct rq *rq) --{ -- if (unlikely(rq->balance_callback)) -- __balance_callback(rq); --} -- --#else -- --static inline void balance_callback(struct rq *rq) --{ --} -- --#endif -- - /** - * schedule_tail - first thing a freshly forked thread must call. - * @prev: the thread we just switched away from. 
-@@ -3709,7 +3736,6 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev) - */ - - rq = finish_task_switch(prev); -- balance_callback(rq); - preempt_enable(); - - if (current->set_child_tid) -@@ -4525,10 +4551,11 @@ static void __sched notrace __schedule(bool preempt) - rq = context_switch(rq, prev, next, &rf); - } else { - rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP); -- rq_unlock_irq(rq, &rf); -- } - -- balance_callback(rq); -+ rq_unpin_lock(rq, &rf); -+ __balance_callbacks(rq); -+ raw_spin_unlock_irq(&rq->lock); -+ } - } - - void __noreturn do_task_dead(void) -@@ -4940,9 +4967,11 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task) - out_unlock: - /* Avoid rq from going away on us: */ - preempt_disable(); -- __task_rq_unlock(rq, &rf); - -- balance_callback(rq); -+ rq_unpin_lock(rq, &rf); -+ __balance_callbacks(rq); -+ raw_spin_unlock(&rq->lock); -+ - preempt_enable(); - } - #else -@@ -5216,6 +5245,7 @@ static int __sched_setscheduler(struct task_struct *p, - int retval, oldprio, oldpolicy = -1, queued, running; - int new_effective_prio, policy = attr->sched_policy; - const struct sched_class *prev_class; -+ struct callback_head *head; - struct rq_flags rf; - int reset_on_fork; - int queue_flags = DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK; -@@ -5454,6 +5484,7 @@ static int __sched_setscheduler(struct task_struct *p, - - /* Avoid rq from going away on us: */ - preempt_disable(); -+ head = splice_balance_callbacks(rq); - task_rq_unlock(rq, p, &rf); - - if (pi) { -@@ -5462,7 +5493,7 @@ static int __sched_setscheduler(struct task_struct *p, - } - - /* Run balance callbacks after we've adjusted the PI chain: */ -- balance_callback(rq); -+ balance_callbacks(rq, head); - preempt_enable(); - - return 0; -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index fac1b121d113..5e03f3210318 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h -@@ -1216,6 +1216,9 @@ static inline void rq_pin_lock(struct 
rq *rq, struct rq_flags *rf) - rq->clock_update_flags &= (RQCF_REQ_SKIP|RQCF_ACT_SKIP); - rf->clock_update_flags = 0; - #endif -+#ifdef CONFIG_SMP -+ SCHED_WARN_ON(rq->balance_callback); -+#endif - } - - static inline void rq_unpin_lock(struct rq *rq, struct rq_flags *rf) --- -2.30.2 - diff --git a/debian/patches-rt/0003_irq_work_handle_some_irq_work_in_a_per_cpu_thread_on_preempt_rt.patch b/debian/patches-rt/0003_irq_work_handle_some_irq_work_in_a_per_cpu_thread_on_preempt_rt.patch new file mode 100644 index 000000000..b7c04330d --- /dev/null +++ b/debian/patches-rt/0003_irq_work_handle_some_irq_work_in_a_per_cpu_thread_on_preempt_rt.patch @@ -0,0 +1,235 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT +Date: Wed, 06 Oct 2021 13:18:51 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The irq_work callback is invoked in hard IRQ context. By default all +callbacks are scheduled for invocation right away (where supported by +the architecture) except for the ones marked IRQ_WORK_LAZY which are +delayed until the next timer-tick. + +A look over the callbacks shows that some of them may acquire locks +(spinlock_t, rwlock_t) which are transformed into sleeping locks on +PREEMPT_RT and must not be acquired in hard IRQ context. +Changing the locks into locks which could be acquired in this context +will lead to other problems such as increased latencies if everything +in the chain has IRQ-off locks. This will not solve all the issues either, as +one callback has been noticed which invokes kref_put() and its callback +invokes kfree(), which cannot be invoked in hardirq context. + +Some callbacks are required to be invoked in hardirq context even on +PREEMPT_RT to work properly. This includes for instance the NO_HZ +callback which needs to be able to observe the idle context.
+ +The callbacks which are required to run in hardirq context have already +been marked. Use this information to split the callbacks onto the two lists +on PREEMPT_RT: +- lazy_list + Work items which are not marked with IRQ_WORK_HARD_IRQ will be added + to this list. Callbacks on this list will be invoked from a per-CPU + thread. + The handler here may acquire sleeping locks such as spinlock_t and + invoke kfree(). + +- raised_list + Work items which are marked with IRQ_WORK_HARD_IRQ will be added to + this list. They will be invoked in hardirq context and must not + acquire any sleeping locks. + +The wake-up of the per-CPU thread occurs from irq_work handler/ +hardirq context. The thread runs with the lowest RT priority to ensure it +runs before any SCHED_OTHER tasks do. + +[bigeasy: melt tglx's irq_work_tick_soft() which splits irq_work_tick() into a + hard and soft variant. Collected fixes over time from Steven + Rostedt and Mike Galbraith. Move to per-CPU threads instead of + softirq as suggested by PeterZ.]
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211007092646.uhshe3ut2wkrcfzv@linutronix.de +--- + kernel/irq_work.c | 118 ++++++++++++++++++++++++++++++++++++++++++++++++------ + 1 file changed, 106 insertions(+), 12 deletions(-) + +--- a/kernel/irq_work.c ++++ b/kernel/irq_work.c +@@ -18,11 +18,36 @@ + #include <linux/cpu.h> + #include <linux/notifier.h> + #include <linux/smp.h> ++#include <linux/smpboot.h> + #include <asm/processor.h> + #include <linux/kasan.h> + + static DEFINE_PER_CPU(struct llist_head, raised_list); + static DEFINE_PER_CPU(struct llist_head, lazy_list); ++static DEFINE_PER_CPU(struct task_struct *, irq_workd); ++ ++static void wake_irq_workd(void) ++{ ++ struct task_struct *tsk = __this_cpu_read(irq_workd); ++ ++ if (!llist_empty(this_cpu_ptr(&lazy_list)) && tsk) ++ wake_up_process(tsk); ++} ++ ++#ifdef CONFIG_SMP ++static void irq_work_wake(struct irq_work *entry) ++{ ++ wake_irq_workd(); ++} ++ ++static DEFINE_PER_CPU(struct irq_work, irq_work_wakeup) = ++ IRQ_WORK_INIT_HARD(irq_work_wake); ++#endif ++ ++static int irq_workd_should_run(unsigned int cpu) ++{ ++ return !llist_empty(this_cpu_ptr(&lazy_list)); ++} + + /* + * Claim the entry so that no one else will poke at it. 
+@@ -52,15 +77,29 @@ void __weak arch_irq_work_raise(void) + /* Enqueue on current CPU, work must already be claimed and preempt disabled */ + static void __irq_work_queue_local(struct irq_work *work) + { ++ struct llist_head *list; ++ bool rt_lazy_work = false; ++ bool lazy_work = false; ++ int work_flags; ++ ++ work_flags = atomic_read(&work->node.a_flags); ++ if (work_flags & IRQ_WORK_LAZY) ++ lazy_work = true; ++ else if (IS_ENABLED(CONFIG_PREEMPT_RT) && ++ !(work_flags & IRQ_WORK_HARD_IRQ)) ++ rt_lazy_work = true; ++ ++ if (lazy_work || rt_lazy_work) ++ list = this_cpu_ptr(&lazy_list); ++ else ++ list = this_cpu_ptr(&raised_list); ++ ++ if (!llist_add(&work->node.llist, list)) ++ return; ++ + /* If the work is "lazy", handle it from next tick if any */ +- if (atomic_read(&work->node.a_flags) & IRQ_WORK_LAZY) { +- if (llist_add(&work->node.llist, this_cpu_ptr(&lazy_list)) && +- tick_nohz_tick_stopped()) +- arch_irq_work_raise(); +- } else { +- if (llist_add(&work->node.llist, this_cpu_ptr(&raised_list))) +- arch_irq_work_raise(); +- } ++ if (!lazy_work || tick_nohz_tick_stopped()) ++ arch_irq_work_raise(); + } + + /* Enqueue the irq work @work on the current CPU */ +@@ -104,17 +143,34 @@ bool irq_work_queue_on(struct irq_work * + if (cpu != smp_processor_id()) { + /* Arch remote IPI send/receive backend aren't NMI safe */ + WARN_ON_ONCE(in_nmi()); ++ ++ /* ++ * On PREEMPT_RT the items which are not marked as ++ * IRQ_WORK_HARD_IRQ are added to the lazy list and a HARD work ++ * item is used on the remote CPU to wake the thread. 
++ */ ++ if (IS_ENABLED(CONFIG_PREEMPT_RT) && ++ !(atomic_read(&work->node.a_flags) & IRQ_WORK_HARD_IRQ)) { ++ ++ if (!llist_add(&work->node.llist, &per_cpu(lazy_list, cpu))) ++ goto out; ++ ++ work = &per_cpu(irq_work_wakeup, cpu); ++ if (!irq_work_claim(work)) ++ goto out; ++ } ++ + __smp_call_single_queue(cpu, &work->node.llist); + } else { + __irq_work_queue_local(work); + } ++out: + preempt_enable(); + + return true; + #endif /* CONFIG_SMP */ + } + +- + bool irq_work_needs_cpu(void) + { + struct llist_head *raised, *lazy; +@@ -170,7 +226,12 @@ static void irq_work_run_list(struct lli + struct irq_work *work, *tmp; + struct llist_node *llnode; + +- BUG_ON(!irqs_disabled()); ++ /* ++ * On PREEMPT_RT IRQ-work which is not marked as HARD will be processed ++ * in a per-CPU thread in preemptible context. Only the items which are ++ * marked as IRQ_WORK_HARD_IRQ will be processed in hardirq context. ++ */ ++ BUG_ON(!irqs_disabled() && !IS_ENABLED(CONFIG_PREEMPT_RT)); + + if (llist_empty(list)) + return; +@@ -187,7 +248,10 @@ static void irq_work_run_list(struct lli + void irq_work_run(void) + { + irq_work_run_list(this_cpu_ptr(&raised_list)); +- irq_work_run_list(this_cpu_ptr(&lazy_list)); ++ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) ++ irq_work_run_list(this_cpu_ptr(&lazy_list)); ++ else ++ wake_irq_workd(); + } + EXPORT_SYMBOL_GPL(irq_work_run); + +@@ -197,7 +261,11 @@ void irq_work_tick(void) + + if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) + irq_work_run_list(raised); +- irq_work_run_list(this_cpu_ptr(&lazy_list)); ++ ++ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) ++ irq_work_run_list(this_cpu_ptr(&lazy_list)); ++ else ++ wake_irq_workd(); + } + + /* +@@ -219,3 +287,29 @@ void irq_work_sync(struct irq_work *work + cpu_relax(); + } + EXPORT_SYMBOL_GPL(irq_work_sync); ++ ++static void run_irq_workd(unsigned int cpu) ++{ ++ irq_work_run_list(this_cpu_ptr(&lazy_list)); ++} ++ ++static void irq_workd_setup(unsigned int cpu) ++{ ++ sched_set_fifo_low(current); ++} ++ 
++static struct smp_hotplug_thread irqwork_threads = { ++ .store = &irq_workd, ++ .setup = irq_workd_setup, ++ .thread_should_run = irq_workd_should_run, ++ .thread_fn = run_irq_workd, ++ .thread_comm = "irq_work/%u", ++}; ++ ++static __init int irq_work_init_threads(void) ++{ ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ BUG_ON(smpboot_register_percpu_thread(&irqwork_threads)); ++ return 0; ++} ++early_initcall(irq_work_init_threads); diff --git a/debian/patches-rt/0003_kcov_allocate_per_cpu_memory_on_the_relevant_node.patch b/debian/patches-rt/0003_kcov_allocate_per_cpu_memory_on_the_relevant_node.patch new file mode 100644 index 000000000..7326a329c --- /dev/null +++ b/debian/patches-rt/0003_kcov_allocate_per_cpu_memory_on_the_relevant_node.patch @@ -0,0 +1,30 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: kcov: Allocate per-CPU memory on the relevant node. +Date: Mon, 30 Aug 2021 19:26:25 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +During boot kcov allocates per-CPU memory which is used later if remote/ +softirq processing is enabled. + +Allocate the per-CPU memory on the CPU local node to avoid cross node +memory access. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210830172627.267989-4-bigeasy@linutronix.de +--- + kernel/kcov.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/kernel/kcov.c ++++ b/kernel/kcov.c +@@ -1034,8 +1034,8 @@ static int __init kcov_init(void) + int cpu; + + for_each_possible_cpu(cpu) { +- void *area = vmalloc(CONFIG_KCOV_IRQ_AREA_SIZE * +- sizeof(unsigned long)); ++ void *area = vmalloc_node(CONFIG_KCOV_IRQ_AREA_SIZE * ++ sizeof(unsigned long), cpu_to_node(cpu)); + if (!area) + return -ENOMEM; + per_cpu_ptr(&kcov_percpu_data, cpu)->irq_area = area; diff --git a/debian/patches-rt/0003_sched_move_kprobes_cleanup_out_of_finish_task_switch.patch b/debian/patches-rt/0003_sched_move_kprobes_cleanup_out_of_finish_task_switch.patch new file mode 100644 index 000000000..fb4208a4c --- /dev/null +++ b/debian/patches-rt/0003_sched_move_kprobes_cleanup_out_of_finish_task_switch.patch @@ -0,0 +1,72 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Move kprobes cleanup out of finish_task_switch() +Date: Tue, 28 Sep 2021 14:24:28 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Doing cleanups in the tail of schedule() is a latency punishment for the +incoming task. The point of invoking kprobe_flush_task() for a dead task +is that the instances are returned and cannot leak when __schedule() is +kprobed. + +Move it into the delayed cleanup.
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Cc: Masami Hiramatsu <mhiramat@kernel.org> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210928122411.537994026@linutronix.de +--- + kernel/exit.c | 2 ++ + kernel/kprobes.c | 8 ++++---- + kernel/sched/core.c | 6 ------ + 3 files changed, 6 insertions(+), 10 deletions(-) + +--- a/kernel/exit.c ++++ b/kernel/exit.c +@@ -64,6 +64,7 @@ + #include <linux/rcuwait.h> + #include <linux/compat.h> + #include <linux/io_uring.h> ++#include <linux/kprobes.h> + + #include <linux/uaccess.h> + #include <asm/unistd.h> +@@ -168,6 +169,7 @@ static void delayed_put_task_struct(stru + { + struct task_struct *tsk = container_of(rhp, struct task_struct, rcu); + ++ kprobe_flush_task(tsk); + perf_event_delayed_put(tsk); + trace_sched_process_free(tsk); + put_task_struct(tsk); +--- a/kernel/kprobes.c ++++ b/kernel/kprobes.c +@@ -1250,10 +1250,10 @@ void kprobe_busy_end(void) + } + + /* +- * This function is called from finish_task_switch when task tk becomes dead, +- * so that we can recycle any function-return probe instances associated +- * with this task. These left over instances represent probed functions +- * that have been called but will never return. ++ * This function is called from delayed_put_task_struct() when a task is ++ * dead and cleaned up to recycle any function-return probe instances ++ * associated with this task. These left over instances represent probed ++ * functions that have been called but will never return. + */ + void kprobe_flush_task(struct task_struct *tk) + { +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -4845,12 +4845,6 @@ static struct rq *finish_task_switch(str + if (prev->sched_class->task_dead) + prev->sched_class->task_dead(prev); + +- /* +- * Remove function-return probe instances associated with this +- * task and put them back on the free list. +- */ +- kprobe_flush_task(prev); +- + /* Task is done with its stack. 
*/ + put_task_stack(prev); + diff --git a/debian/patches-rt/0003_sched_remove_preempt_offset_argument_from___might_sleep.patch b/debian/patches-rt/0003_sched_remove_preempt_offset_argument_from___might_sleep.patch new file mode 100644 index 000000000..ab2cb99a9 --- /dev/null +++ b/debian/patches-rt/0003_sched_remove_preempt_offset_argument_from___might_sleep.patch @@ -0,0 +1,77 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Remove preempt_offset argument from __might_sleep() +Date: Thu, 23 Sep 2021 18:54:38 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +All callers hand in 0 and never will hand in anything else. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210923165358.054321586@linutronix.de +--- + include/linux/kernel.h | 7 +++---- + kernel/sched/core.c | 4 ++-- + mm/memory.c | 2 +- + 3 files changed, 6 insertions(+), 7 deletions(-) + +--- a/include/linux/kernel.h ++++ b/include/linux/kernel.h +@@ -112,7 +112,7 @@ static __always_inline void might_resche + + #ifdef CONFIG_DEBUG_ATOMIC_SLEEP + extern void __might_resched(const char *file, int line, int preempt_offset); +-extern void __might_sleep(const char *file, int line, int preempt_offset); ++extern void __might_sleep(const char *file, int line); + extern void __cant_sleep(const char *file, int line, int preempt_offset); + extern void __cant_migrate(const char *file, int line); + +@@ -129,7 +129,7 @@ extern void __cant_migrate(const char *f + * supposed to. 
+ */ + # define might_sleep() \ +- do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0) ++ do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0) + /** + * cant_sleep - annotation for functions that cannot sleep + * +@@ -170,8 +170,7 @@ extern void __cant_migrate(const char *f + #else + static inline void __might_resched(const char *file, int line, + int preempt_offset) { } +- static inline void __might_sleep(const char *file, int line, +- int preempt_offset) { } ++static inline void __might_sleep(const char *file, int line) { } + # define might_sleep() do { might_resched(); } while (0) + # define cant_sleep() do { } while (0) + # define cant_migrate() do { } while (0) +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -9475,7 +9475,7 @@ static inline int preempt_count_equals(i + return (nested == preempt_offset); + } + +-void __might_sleep(const char *file, int line, int preempt_offset) ++void __might_sleep(const char *file, int line) + { + unsigned int state = get_current_state(); + /* +@@ -9489,7 +9489,7 @@ void __might_sleep(const char *file, int + (void *)current->task_state_change, + (void *)current->task_state_change); + +- __might_resched(file, line, preempt_offset); ++ __might_resched(file, line, 0); + } + EXPORT_SYMBOL(__might_sleep); + +--- a/mm/memory.c ++++ b/mm/memory.c +@@ -5256,7 +5256,7 @@ void __might_fault(const char *file, int + return; + if (pagefault_disabled()) + return; +- __might_sleep(file, line, 0); ++ __might_sleep(file, line); + #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) + if (current->mm) + might_lock_read(¤t->mm->mmap_lock); diff --git a/debian/patches-rt/0258-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch b/debian/patches-rt/0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch index ca6c687d7..674fa6ec2 100644 --- a/debian/patches-rt/0258-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch +++ 
b/debian/patches-rt/0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch @@ -1,9 +1,8 @@ -From 5e15ec26326e11002e9d0d3e384e20cfac458930 Mon Sep 17 00:00:00 2001 From: Mike Galbraith <umgwanakikbuti@gmail.com> Date: Sat, 27 Feb 2016 09:01:42 +0100 -Subject: [PATCH 258/296] drm/i915: Don't disable interrupts on PREEMPT_RT - during atomic updates -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Subject: [PATCH 04/10] drm/i915: Don't disable interrupts on PREEMPT_RT during + atomic updates +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz Commit 8d7849db3eab7 ("drm/i915: Make sprite updates atomic") @@ -14,6 +13,17 @@ are sleeping locks on PREEMPT_RT. According to the comment the interrupts are disabled to avoid random delays and not required for protection or synchronisation. +If this needs to happen with disabled interrupts on PREEMPT_RT, and the +whole section is restricted to register access then all sleeping locks +need to be acquired before interrupts are disabled and some function +maybe moved after enabling interrupts again. +This includes: +- prepare_to_wait() + finish_wait() due its wake queue. +- drm_crtc_vblank_put() -> vblank_disable_fn() drm_device::vbl_lock. +- skl_pfit_enable(), intel_update_plane(), vlv_atomic_update_fifo() and + maybe others due to intel_uncore::lock +- drm_crtc_arm_vblank_event() due to drm_device::event_lock and + drm_device::vblank_time_lock. Don't disable interrupts on PREEMPT_RT during atomic updates. @@ -22,16 +32,14 @@ Don't disable interrupts on PREEMPT_RT during atomic updates. 
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- - drivers/gpu/drm/i915/display/intel_sprite.c | 15 ++++++++++----- + drivers/gpu/drm/i915/display/intel_crtc.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) -diff --git a/drivers/gpu/drm/i915/display/intel_sprite.c b/drivers/gpu/drm/i915/display/intel_sprite.c -index 12f7128b777f..a65061e3e1d3 100644 ---- a/drivers/gpu/drm/i915/display/intel_sprite.c -+++ b/drivers/gpu/drm/i915/display/intel_sprite.c -@@ -118,7 +118,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state) - "PSR idle timed out 0x%x, atomic update may fail\n", - psr_status); +--- a/drivers/gpu/drm/i915/display/intel_crtc.c ++++ b/drivers/gpu/drm/i915/display/intel_crtc.c +@@ -425,7 +425,8 @@ void intel_pipe_update_start(const struc + */ + intel_psr_wait_for_idle(new_crtc_state); - local_irq_disable(); + if (!IS_ENABLED(CONFIG_PREEMPT_RT)) @@ -39,7 +47,7 @@ index 12f7128b777f..a65061e3e1d3 100644 crtc->debug.min_vbl = min; crtc->debug.max_vbl = max; -@@ -143,11 +144,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state) +@@ -450,11 +451,13 @@ void intel_pipe_update_start(const struc break; } @@ -55,7 +63,7 @@ index 12f7128b777f..a65061e3e1d3 100644 } finish_wait(wq, &wait); -@@ -180,7 +183,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state) +@@ -487,7 +490,8 @@ void intel_pipe_update_start(const struc return; irq_disable: @@ -64,8 +72,8 @@ index 12f7128b777f..a65061e3e1d3 100644 + local_irq_disable(); } - /** -@@ -218,7 +222,8 @@ void intel_pipe_update_end(struct intel_crtc_state *new_crtc_state) + #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_VBLANK_EVADE) +@@ -566,7 +570,8 @@ void intel_pipe_update_end(struct intel_ new_crtc_state->uapi.event = NULL; } @@ -73,8 +81,5 @@ index 12f7128b777f..a65061e3e1d3 100644 + if (!IS_ENABLED(CONFIG_PREEMPT_RT)) + local_irq_enable(); - if 
(intel_vgpu_active(dev_priv)) - return; --- -2.30.2 - + /* Send VRR Push to terminate Vblank */ + intel_vrr_send_push(new_crtc_state); diff --git a/debian/patches-rt/0004-gen_stats-Move-remaining-users-to-gnet_stats_add_que.patch b/debian/patches-rt/0004-gen_stats-Move-remaining-users-to-gnet_stats_add_que.patch new file mode 100644 index 000000000..421e84227 --- /dev/null +++ b/debian/patches-rt/0004-gen_stats-Move-remaining-users-to-gnet_stats_add_que.patch @@ -0,0 +1,108 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:05 +0200 +Subject: [PATCH 4/9] gen_stats: Move remaining users to + gnet_stats_add_queue(). +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The gnet_stats_queue::qlen member is only used in the SMP-case. + +qdisc_qstats_qlen_backlog() needs to add qdisc_qlen() to qstats.qlen to +have the same value as that provided by qdisc_qlen_sum(). + +gnet_stats_copy_queue() needs to overwrite the resulting qstats.qlen +field with the caller-submitted qlen value. The accumulated value +might differ from the submitted one. + +Let both functions use gnet_stats_add_queue() and remove unused +__gnet_stats_copy_queue(). + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S.
Miller <davem@davemloft.net> +--- + include/net/gen_stats.h | 3 --- + include/net/sch_generic.h | 5 ++--- + net/core/gen_stats.c | 39 ++------------------------------------- + 3 files changed, 4 insertions(+), 43 deletions(-) + +--- a/include/net/gen_stats.h ++++ b/include/net/gen_stats.h +@@ -59,9 +59,6 @@ int gnet_stats_copy_rate_est(struct gnet + int gnet_stats_copy_queue(struct gnet_dump *d, + struct gnet_stats_queue __percpu *cpu_q, + struct gnet_stats_queue *q, __u32 qlen); +-void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats, +- const struct gnet_stats_queue __percpu *cpu_q, +- const struct gnet_stats_queue *q, __u32 qlen); + void gnet_stats_add_queue(struct gnet_stats_queue *qstats, + const struct gnet_stats_queue __percpu *cpu_q, + const struct gnet_stats_queue *q); +--- a/include/net/sch_generic.h ++++ b/include/net/sch_generic.h +@@ -968,10 +968,9 @@ static inline void qdisc_qstats_qlen_bac + __u32 *backlog) + { + struct gnet_stats_queue qstats = { 0 }; +- __u32 len = qdisc_qlen_sum(sch); + +- __gnet_stats_copy_queue(&qstats, sch->cpu_qstats, &sch->qstats, len); +- *qlen = qstats.qlen; ++ gnet_stats_add_queue(&qstats, sch->cpu_qstats, &sch->qstats); ++ *qlen = qstats.qlen + qdisc_qlen(sch); + *backlog = qstats.backlog; + } + +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -285,42 +285,6 @@ gnet_stats_copy_rate_est(struct gnet_dum + } + EXPORT_SYMBOL(gnet_stats_copy_rate_est); + +-static void +-__gnet_stats_copy_queue_cpu(struct gnet_stats_queue *qstats, +- const struct gnet_stats_queue __percpu *q) +-{ +- int i; +- +- for_each_possible_cpu(i) { +- const struct gnet_stats_queue *qcpu = per_cpu_ptr(q, i); +- +- qstats->qlen = 0; +- qstats->backlog += qcpu->backlog; +- qstats->drops += qcpu->drops; +- qstats->requeues += qcpu->requeues; +- qstats->overlimits += qcpu->overlimits; +- } +-} +- +-void __gnet_stats_copy_queue(struct gnet_stats_queue *qstats, +- const struct gnet_stats_queue __percpu *cpu, +- const struct gnet_stats_queue 
*q, +- __u32 qlen) +-{ +- if (cpu) { +- __gnet_stats_copy_queue_cpu(qstats, cpu); +- } else { +- qstats->qlen = q->qlen; +- qstats->backlog = q->backlog; +- qstats->drops = q->drops; +- qstats->requeues = q->requeues; +- qstats->overlimits = q->overlimits; +- } +- +- qstats->qlen = qlen; +-} +-EXPORT_SYMBOL(__gnet_stats_copy_queue); +- + static void gnet_stats_add_queue_cpu(struct gnet_stats_queue *qstats, + const struct gnet_stats_queue __percpu *q) + { +@@ -374,7 +338,8 @@ gnet_stats_copy_queue(struct gnet_dump * + { + struct gnet_stats_queue qstats = {0}; + +- __gnet_stats_copy_queue(&qstats, cpu_q, q, qlen); ++ gnet_stats_add_queue(&qstats, cpu_q, q); ++ qstats.qlen = qlen; + + if (d->compat_tc_stats) { + d->tc_stats.drops = qstats.drops; diff --git a/debian/patches-rt/0004-rtmutex-Add-rt_mutex_lock_nest_lock-and-rt_mutex_loc.patch b/debian/patches-rt/0004-rtmutex-Add-rt_mutex_lock_nest_lock-and-rt_mutex_loc.patch new file mode 100644 index 000000000..e0db57bfa --- /dev/null +++ b/debian/patches-rt/0004-rtmutex-Add-rt_mutex_lock_nest_lock-and-rt_mutex_loc.patch @@ -0,0 +1,116 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri, 13 Aug 2021 13:49:49 +0200 +Subject: [PATCH 04/10] rtmutex: Add rt_mutex_lock_nest_lock() and + rt_mutex_lock_killable(). +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The locking selftest for ww-mutex expects to operate directly on the +base-mutex which becomes a rtmutex on PREEMPT_RT. + +Add rt_mutex_lock_nest_lock(), follows mutex_lock_nest_lock() for +rtmutex. +Add rt_mutex_lock_killable(), follows mutex_lock_killable() for rtmutex. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + include/linux/rtmutex.h | 9 +++++++++ + kernel/locking/rtmutex_api.c | 30 ++++++++++++++++++++++++++---- + 2 files changed, 35 insertions(+), 4 deletions(-) + +--- a/include/linux/rtmutex.h ++++ b/include/linux/rtmutex.h +@@ -99,13 +99,22 @@ extern void __rt_mutex_init(struct rt_mu + + #ifdef CONFIG_DEBUG_LOCK_ALLOC + extern void rt_mutex_lock_nested(struct rt_mutex *lock, unsigned int subclass); ++extern void _rt_mutex_lock_nest_lock(struct rt_mutex *lock, struct lockdep_map *nest_lock); + #define rt_mutex_lock(lock) rt_mutex_lock_nested(lock, 0) ++#define rt_mutex_lock_nest_lock(lock, nest_lock) \ ++ do { \ ++ typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \ ++ _rt_mutex_lock_nest_lock(lock, &(nest_lock)->dep_map); \ ++ } while (0) ++ + #else + extern void rt_mutex_lock(struct rt_mutex *lock); + #define rt_mutex_lock_nested(lock, subclass) rt_mutex_lock(lock) ++#define rt_mutex_lock_nest_lock(lock, nest_lock) rt_mutex_lock(lock) + #endif + + extern int rt_mutex_lock_interruptible(struct rt_mutex *lock); ++extern int rt_mutex_lock_killable(struct rt_mutex *lock); + extern int rt_mutex_trylock(struct rt_mutex *lock); + + extern void rt_mutex_unlock(struct rt_mutex *lock); +--- a/kernel/locking/rtmutex_api.c ++++ b/kernel/locking/rtmutex_api.c +@@ -21,12 +21,13 @@ int max_lock_depth = 1024; + */ + static __always_inline int __rt_mutex_lock_common(struct rt_mutex *lock, + unsigned int state, ++ struct lockdep_map *nest_lock, + unsigned int subclass) + { + int ret; + + might_sleep(); +- mutex_acquire(&lock->dep_map, subclass, 0, _RET_IP_); ++ mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, _RET_IP_); + ret = __rt_mutex_lock(&lock->rtmutex, state); + if (ret) + mutex_release(&lock->dep_map, _RET_IP_); +@@ -48,10 +49,16 @@ EXPORT_SYMBOL(rt_mutex_base_init); + */ + void __sched rt_mutex_lock_nested(struct rt_mutex *lock, unsigned int subclass) + { +- 
__rt_mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, subclass); ++ __rt_mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, NULL, subclass); + } + EXPORT_SYMBOL_GPL(rt_mutex_lock_nested); + ++void __sched _rt_mutex_lock_nest_lock(struct rt_mutex *lock, struct lockdep_map *nest_lock) ++{ ++ __rt_mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, nest_lock, 0); ++} ++EXPORT_SYMBOL_GPL(_rt_mutex_lock_nest_lock); ++ + #else /* !CONFIG_DEBUG_LOCK_ALLOC */ + + /** +@@ -61,7 +68,7 @@ EXPORT_SYMBOL_GPL(rt_mutex_lock_nested); + */ + void __sched rt_mutex_lock(struct rt_mutex *lock) + { +- __rt_mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0); ++ __rt_mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, NULL, 0); + } + EXPORT_SYMBOL_GPL(rt_mutex_lock); + #endif +@@ -77,11 +84,26 @@ EXPORT_SYMBOL_GPL(rt_mutex_lock); + */ + int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock) + { +- return __rt_mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0); ++ return __rt_mutex_lock_common(lock, TASK_INTERRUPTIBLE, NULL, 0); + } + EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible); + + /** ++ * rt_mutex_lock_killable - lock a rt_mutex killable ++ * ++ * @lock: the rt_mutex to be locked ++ * ++ * Returns: ++ * 0 on success ++ * -EINTR when interrupted by a signal ++ */ ++int __sched rt_mutex_lock_killable(struct rt_mutex *lock) ++{ ++ return __rt_mutex_lock_common(lock, TASK_KILLABLE, NULL, 0); ++} ++EXPORT_SYMBOL_GPL(rt_mutex_lock_killable); ++ ++/** + * rt_mutex_trylock - try to lock a rt_mutex + * + * @lock: the rt_mutex to be locked diff --git a/debian/patches-rt/0004-sched-hotplug-Ensure-only-per-cpu-kthreads-run-durin.patch b/debian/patches-rt/0004-sched-hotplug-Ensure-only-per-cpu-kthreads-run-durin.patch deleted file mode 100644 index 5c498a79b..000000000 --- a/debian/patches-rt/0004-sched-hotplug-Ensure-only-per-cpu-kthreads-run-durin.patch +++ /dev/null @@ -1,244 +0,0 @@ -From c66cb9f1491ba73a2ab6bc0fedc642bbeeacfdef Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 
23 Oct 2020 12:12:01 +0200 -Subject: [PATCH 004/296] sched/hotplug: Ensure only per-cpu kthreads run - during hotplug -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -In preparation for migrate_disable(), make sure only per-cpu kthreads -are allowed to run on !active CPUs. - -This is ran (as one of the very first steps) from the cpu-hotplug -task which is a per-cpu kthread and completion of the hotplug -operation only requires such tasks. - -This constraint enables the migrate_disable() implementation to wait -for completion of all migrate_disable regions on this CPU at hotplug -time without fear of any new ones starting. - -This replaces the unlikely(rq->balance_callbacks) test at the tail of -context_switch with an unlikely(rq->balance_work), the fast path is -not affected. - -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 114 ++++++++++++++++++++++++++++++++++++++++++- - kernel/sched/sched.h | 7 ++- - 2 files changed, 118 insertions(+), 3 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 56f9a850c9bf..feca16c2a780 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -3511,8 +3511,10 @@ static inline struct callback_head *splice_balance_callbacks(struct rq *rq) - struct callback_head *head = rq->balance_callback; - - lockdep_assert_held(&rq->lock); -- if (head) -+ if (head) { - rq->balance_callback = NULL; -+ rq->balance_flags &= ~BALANCE_WORK; -+ } - - return head; - } -@@ -3533,6 +3535,21 @@ static inline void balance_callbacks(struct rq *rq, struct callback_head *head) - } - } - -+static void balance_push(struct rq *rq); -+ -+static inline void balance_switch(struct rq *rq) -+{ -+ if (likely(!rq->balance_flags)) -+ return; -+ -+ if (rq->balance_flags & BALANCE_PUSH) { -+ balance_push(rq); -+ return; -+ } -+ -+ __balance_callbacks(rq); -+} -+ - #else - - static 
inline void __balance_callbacks(struct rq *rq) -@@ -3548,6 +3565,10 @@ static inline void balance_callbacks(struct rq *rq, struct callback_head *head) - { - } - -+static inline void balance_switch(struct rq *rq) -+{ -+} -+ - #endif - - static inline void -@@ -3575,7 +3596,7 @@ static inline void finish_lock_switch(struct rq *rq) - * prev into current: - */ - spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_); -- __balance_callbacks(rq); -+ balance_switch(rq); - raw_spin_unlock_irq(&rq->lock); - } - -@@ -6831,6 +6852,90 @@ static void migrate_tasks(struct rq *dead_rq, struct rq_flags *rf) - - rq->stop = stop; - } -+ -+static int __balance_push_cpu_stop(void *arg) -+{ -+ struct task_struct *p = arg; -+ struct rq *rq = this_rq(); -+ struct rq_flags rf; -+ int cpu; -+ -+ raw_spin_lock_irq(&p->pi_lock); -+ rq_lock(rq, &rf); -+ -+ update_rq_clock(rq); -+ -+ if (task_rq(p) == rq && task_on_rq_queued(p)) { -+ cpu = select_fallback_rq(rq->cpu, p); -+ rq = __migrate_task(rq, &rf, p, cpu); -+ } -+ -+ rq_unlock(rq, &rf); -+ raw_spin_unlock_irq(&p->pi_lock); -+ -+ put_task_struct(p); -+ -+ return 0; -+} -+ -+static DEFINE_PER_CPU(struct cpu_stop_work, push_work); -+ -+/* -+ * Ensure we only run per-cpu kthreads once the CPU goes !active. -+ */ -+static void balance_push(struct rq *rq) -+{ -+ struct task_struct *push_task = rq->curr; -+ -+ lockdep_assert_held(&rq->lock); -+ SCHED_WARN_ON(rq->cpu != smp_processor_id()); -+ -+ /* -+ * Both the cpu-hotplug and stop task are in this case and are -+ * required to complete the hotplug process. -+ */ -+ if (is_per_cpu_kthread(push_task)) -+ return; -+ -+ get_task_struct(push_task); -+ /* -+ * Temporarily drop rq->lock such that we can wake-up the stop task. -+ * Both preemption and IRQs are still disabled. -+ */ -+ raw_spin_unlock(&rq->lock); -+ stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task, -+ this_cpu_ptr(&push_work)); -+ /* -+ * At this point need_resched() is true and we'll take the loop in -+ * schedule(). 
The next pick is obviously going to be the stop task -+ * which is_per_cpu_kthread() and will push this task away. -+ */ -+ raw_spin_lock(&rq->lock); -+} -+ -+static void balance_push_set(int cpu, bool on) -+{ -+ struct rq *rq = cpu_rq(cpu); -+ struct rq_flags rf; -+ -+ rq_lock_irqsave(rq, &rf); -+ if (on) -+ rq->balance_flags |= BALANCE_PUSH; -+ else -+ rq->balance_flags &= ~BALANCE_PUSH; -+ rq_unlock_irqrestore(rq, &rf); -+} -+ -+#else -+ -+static inline void balance_push(struct rq *rq) -+{ -+} -+ -+static inline void balance_push_set(int cpu, bool on) -+{ -+} -+ - #endif /* CONFIG_HOTPLUG_CPU */ - - void set_rq_online(struct rq *rq) -@@ -6916,6 +7021,8 @@ int sched_cpu_activate(unsigned int cpu) - struct rq *rq = cpu_rq(cpu); - struct rq_flags rf; - -+ balance_push_set(cpu, false); -+ - #ifdef CONFIG_SCHED_SMT - /* - * When going up, increment the number of cores with SMT present. -@@ -6963,6 +7070,8 @@ int sched_cpu_deactivate(unsigned int cpu) - */ - synchronize_rcu(); - -+ balance_push_set(cpu, true); -+ - #ifdef CONFIG_SCHED_SMT - /* - * When going down, decrement the number of cores with SMT present. 
-@@ -6976,6 +7085,7 @@ int sched_cpu_deactivate(unsigned int cpu) - - ret = cpuset_cpu_inactive(cpu); - if (ret) { -+ balance_push_set(cpu, false); - set_cpu_active(cpu, true); - return ret; - } -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index 5e03f3210318..ca3106f66351 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h -@@ -967,6 +967,7 @@ struct rq { - unsigned long cpu_capacity_orig; - - struct callback_head *balance_callback; -+ unsigned char balance_flags; - - unsigned char nohz_idle_balance; - unsigned char idle_balance; -@@ -1380,6 +1381,9 @@ init_numa_balancing(unsigned long clone_flags, struct task_struct *p) - - #ifdef CONFIG_SMP - -+#define BALANCE_WORK 0x01 -+#define BALANCE_PUSH 0x02 -+ - static inline void - queue_balance_callback(struct rq *rq, - struct callback_head *head, -@@ -1387,12 +1391,13 @@ queue_balance_callback(struct rq *rq, - { - lockdep_assert_held(&rq->lock); - -- if (unlikely(head->next)) -+ if (unlikely(head->next || (rq->balance_flags & BALANCE_PUSH))) - return; - - head->func = (void (*)(struct callback_head *))func; - head->next = rq->balance_callback; - rq->balance_callback = head; -+ rq->balance_flags |= BALANCE_WORK; - } - - #define rcu_dereference_check_sched_domain(p) \ --- -2.30.2 - diff --git a/debian/patches-rt/0004_irq_work_also_rcuwait_for_irq_work_hard_irq_on_preempt_rt.patch b/debian/patches-rt/0004_irq_work_also_rcuwait_for_irq_work_hard_irq_on_preempt_rt.patch new file mode 100644 index 000000000..234297710 --- /dev/null +++ b/debian/patches-rt/0004_irq_work_also_rcuwait_for_irq_work_hard_irq_on_preempt_rt.patch @@ -0,0 +1,54 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT +Date: Wed, 06 Oct 2021 13:18:52 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +On PREEMPT_RT most items are processed as LAZY via softirq context. 
+Avoid spin-waiting for them because irq_work_sync() could have a higher
+priority and not allow the irq-work to be completed.
+
+Wait additionally for !IRQ_WORK_HARD_IRQ irq_work items on PREEMPT_RT.
+
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Link: https://lore.kernel.org/r/20211006111852.1514359-5-bigeasy@linutronix.de
+---
+ include/linux/irq_work.h | 5 +++++
+ kernel/irq_work.c | 6 ++++--
+ 2 files changed, 9 insertions(+), 2 deletions(-)
+
+--- a/include/linux/irq_work.h
++++ b/include/linux/irq_work.h
+@@ -49,6 +49,11 @@ static inline bool irq_work_is_busy(stru
+ return atomic_read(&work->node.a_flags) & IRQ_WORK_BUSY;
+ }
+
++static inline bool irq_work_is_hard(struct irq_work *work)
++{
++ return atomic_read(&work->node.a_flags) & IRQ_WORK_HARD_IRQ;
++}
++
+ bool irq_work_queue(struct irq_work *work);
+ bool irq_work_queue_on(struct irq_work *work, int cpu);
+
+--- a/kernel/irq_work.c
++++ b/kernel/irq_work.c
+@@ -217,7 +217,8 @@ void irq_work_single(void *arg)
+ */
+ (void)atomic_cmpxchg(&work->node.a_flags, flags, flags & ~IRQ_WORK_BUSY);
+
+- if (!arch_irq_work_has_interrupt())
++ if ((IS_ENABLED(CONFIG_PREEMPT_RT) && !irq_work_is_hard(work)) ||
++ !arch_irq_work_has_interrupt())
+ rcuwait_wake_up(&work->irqwait);
+ }
+
+@@ -277,7 +278,8 @@ void irq_work_sync(struct irq_work *work
+ lockdep_assert_irqs_enabled();
+ might_sleep();
+
+- if (!arch_irq_work_has_interrupt()) {
++ if ((IS_ENABLED(CONFIG_PREEMPT_RT) && !irq_work_is_hard(work)) ||
++ !arch_irq_work_has_interrupt()) {
+ rcuwait_wait_event(&work->irqwait, !irq_work_is_busy(work),
+ TASK_UNINTERRUPTIBLE);
+ return; diff --git a/debian/patches-rt/0004_kcov_avoid_enable_disable_interrupts_if_in_task.patch b/debian/patches-rt/0004_kcov_avoid_enable_disable_interrupts_if_in_task.patch new file mode 100644 index 000000000..07e54b5a1 --- /dev/null +++ b/debian/patches-rt/0004_kcov_avoid_enable_disable_interrupts_if_in_task.patch @@ -0,0 +1,45 @@ +From: Sebastian Andrzej Siewior 
<bigeasy@linutronix.de> +Subject: kcov: Avoid enable+disable interrupts if !in_task(). +Date: Mon, 30 Aug 2021 19:26:26 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +kcov_remote_start() may need to allocate memory in the in_task() case +(otherwise per-CPU memory has been pre-allocated) and therefore requires +enabled interrupts. +The interrupts are enabled before checking if the allocation is required +so if no allocation is required then the interrupts are needlessly +enabled and disabled again. + +Enable interrupts only if memory allocation is performed. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210830172627.267989-5-bigeasy@linutronix.de +--- + kernel/kcov.c | 6 +++--- + 1 file changed, 3 insertions(+), 3 deletions(-) + +--- a/kernel/kcov.c ++++ b/kernel/kcov.c +@@ -869,19 +869,19 @@ void kcov_remote_start(u64 handle) + size = CONFIG_KCOV_IRQ_AREA_SIZE; + area = this_cpu_ptr(&kcov_percpu_data)->irq_area; + } +- spin_unlock_irqrestore(&kcov_remote_lock, flags); ++ spin_unlock(&kcov_remote_lock); + + /* Can only happen when in_task(). */ + if (!area) { ++ local_irqrestore(flags); + area = vmalloc(size * sizeof(unsigned long)); + if (!area) { + kcov_put(kcov); + return; + } ++ local_irq_save(flags); + } + +- local_irq_save(flags); +- + /* Reset coverage size. */ + *(u64 *)area = 0; + diff --git a/debian/patches-rt/0004_sched_cleanup_might_sleep_printks.patch b/debian/patches-rt/0004_sched_cleanup_might_sleep_printks.patch new file mode 100644 index 000000000..b376ae4d4 --- /dev/null +++ b/debian/patches-rt/0004_sched_cleanup_might_sleep_printks.patch @@ -0,0 +1,39 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Cleanup might_sleep() printks +Date: Thu, 23 Sep 2021 18:54:40 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Convert them to pr_*(). No functional change. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210923165358.117496067@linutronix.de +--- + kernel/sched/core.c | 14 ++++++-------- + 1 file changed, 6 insertions(+), 8 deletions(-) + +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -9516,16 +9516,14 @@ void __might_resched(const char *file, i + /* Save this before calling printk(), since that will clobber it: */ + preempt_disable_ip = get_preempt_disable_ip(current); + +- printk(KERN_ERR +- "BUG: sleeping function called from invalid context at %s:%d\n", +- file, line); +- printk(KERN_ERR +- "in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n", +- in_atomic(), irqs_disabled(), current->non_block_count, +- current->pid, current->comm); ++ pr_err("BUG: sleeping function called from invalid context at %s:%d\n", ++ file, line); ++ pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n", ++ in_atomic(), irqs_disabled(), current->non_block_count, ++ current->pid, current->comm); + + if (task_stack_end_corrupted(current)) +- printk(KERN_EMERG "Thread overran stack, or stack corrupted\n"); ++ pr_emerg("Thread overran stack, or stack corrupted\n"); + + debug_show_held_locks(current); + if (irqs_disabled()) diff --git a/debian/patches-rt/0004_sched_delay_task_stack_freeing_on_rt.patch b/debian/patches-rt/0004_sched_delay_task_stack_freeing_on_rt.patch new file mode 100644 index 000000000..d7be73a28 --- /dev/null +++ b/debian/patches-rt/0004_sched_delay_task_stack_freeing_on_rt.patch @@ -0,0 +1,66 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: sched: Delay task stack freeing on RT +Date: Tue, 28 Sep 2021 14:24:30 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Anything which is done on behalf of a dead task at the end of +finish_task_switch() is preventing the incoming task from 
doing useful
+work. While it is beneficial for fork heavy workloads to recycle the task
+stack quickly, this is a latency source for real-time tasks.
+
+Therefore delay the stack cleanup on RT enabled kernels.
+
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+Link: https://lore.kernel.org/r/20210928122411.593486363@linutronix.de
+---
+ kernel/exit.c | 5 +++++
+ kernel/fork.c | 5 ++++-
+ kernel/sched/core.c | 8 ++++++--
+ 3 files changed, 15 insertions(+), 3 deletions(-)
+
+--- a/kernel/exit.c
++++ b/kernel/exit.c
+@@ -172,6 +172,11 @@ static void delayed_put_task_struct(stru
+ kprobe_flush_task(tsk);
+ perf_event_delayed_put(tsk);
+ trace_sched_process_free(tsk);
++
++ /* RT enabled kernels delay freeing the VMAP'ed task stack */
++ if (IS_ENABLED(CONFIG_PREEMPT_RT))
++ put_task_stack(tsk);
++
+ put_task_struct(tsk);
+ }
+
+--- a/kernel/fork.c
++++ b/kernel/fork.c
+@@ -289,7 +289,10 @@ static inline void free_thread_stack(str
+ return;
+ }
+
+- vfree_atomic(tsk->stack);
++ if (!IS_ENABLED(CONFIG_PREEMPT_RT))
++ vfree_atomic(tsk->stack);
++ else
++ vfree(tsk->stack);
+ return;
+ }
+ #endif
+--- a/kernel/sched/core.c
++++ b/kernel/sched/core.c
+@@ -4845,8 +4845,12 @@ static struct rq *finish_task_switch(str
+ if (prev->sched_class->task_dead)
+ prev->sched_class->task_dead(prev);
+
+- /* Task is done with its stack. */
+- put_task_stack(prev);
++ /*
++ * Release VMAP'ed task stack immediately for reuse. On RT
++ * enabled kernels this is delayed for latency reasons. 
++ */ ++ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) ++ put_task_stack(prev); + + put_task_struct_rcu_user(prev); + } diff --git a/debian/patches-rt/0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch b/debian/patches-rt/0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch new file mode 100644 index 000000000..b2e2766b8 --- /dev/null +++ b/debian/patches-rt/0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch @@ -0,0 +1,30 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Mon, 25 Oct 2021 15:05:18 +0200 +Subject: [PATCH 05/10] drm/i915: Don't check for atomic context on PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT +because the uncore::lock is a spinlock_t and does not disable +preemption or interrupts. + +Changing the uncore:lock to a raw_spinlock_t doubles the worst case +latency on an otherwise idle testbox during testing. Therefore I'm +currently unsure about changing this. + +Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + drivers/gpu/drm/i915/i915_utils.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/drivers/gpu/drm/i915/i915_utils.h ++++ b/drivers/gpu/drm/i915/i915_utils.h +@@ -343,7 +343,7 @@ wait_remaining_ms_from_jiffies(unsigned + #define wait_for(COND, MS) _wait_for((COND), (MS) * 1000, 10, 1000) + + /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. 
*/
+-#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT)
++#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT) && !defined(CONFIG_PREEMPT_RT)
+ # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) WARN_ON_ONCE((ATOMIC) && !in_atomic())
+ #else
+ # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) do { } while (0)
diff --git a/debian/patches-rt/0253-lockdep-Make-it-RT-aware.patch b/debian/patches-rt/0005-lockdep-Make-it-RT-aware.patch
index 4039fe6bf..7ee8e296c 100644
--- a/debian/patches-rt/0253-lockdep-Make-it-RT-aware.patch
+++ b/debian/patches-rt/0005-lockdep-Make-it-RT-aware.patch
@@ -1,19 +1,26 @@
-From 088e1a0e426c5c6fd9492213313a084f92d39fed Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Sun, 17 Jul 2011 18:51:23 +0200
-Subject: [PATCH 253/296] lockdep: Make it RT aware
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
+Subject: [PATCH 05/10] lockdep: Make it RT aware
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
-teach lockdep that we don't really do softirqs on -RT.
+There is not really a softirq context on PREEMPT_RT.
+Softirqs on PREEMPT_RT are always invoked within the context of a threaded
+interrupt handler or within ksoftirqd. The "in-softirq" context is preemptible
+and is protected by a per-CPU lock to ensure mutual exclusion.
+
+There is no difference on PREEMPT_RT between spin_lock_irq() and spin_lock()
+because the former does not disable interrupts. Therefore if a lock is used
+in_softirq() and locked once with spin_lock_irq() then lockdep will report this
+with "inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage".
+
+Teach lockdep that we don't really do softirqs on -RT. 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- - include/linux/irqflags.h | 23 +++++++++++++++-------- - kernel/locking/lockdep.c | 2 ++ + include/linux/irqflags.h | 23 +++++++++++++++-------- + kernel/locking/lockdep.c | 2 ++ 2 files changed, 17 insertions(+), 8 deletions(-) -diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h -index 3ed4e8771b64..a437b2e70d37 100644 --- a/include/linux/irqflags.h +++ b/include/linux/irqflags.h @@ -71,14 +71,6 @@ do { \ @@ -53,11 +60,9 @@ index 3ed4e8771b64..a437b2e70d37 100644 #if defined(CONFIG_IRQSOFF_TRACER) || \ defined(CONFIG_PREEMPT_TRACER) extern void stop_critical_timings(void); -diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c -index 38d7c03e694c..4d515978dae9 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c -@@ -5292,6 +5292,7 @@ static noinstr void check_flags(unsigned long flags) +@@ -5473,6 +5473,7 @@ static noinstr void check_flags(unsigned } } @@ -65,7 +70,7 @@ index 38d7c03e694c..4d515978dae9 100644 /* * We dont accurately track softirq state in e.g. 
* hardirq contexts (such as on 4KSTACKS), so only -@@ -5306,6 +5307,7 @@ static noinstr void check_flags(unsigned long flags) +@@ -5487,6 +5488,7 @@ static noinstr void check_flags(unsigned DEBUG_LOCKS_WARN_ON(!current->softirqs_enabled); } } @@ -73,6 +78,3 @@ index 38d7c03e694c..4d515978dae9 100644 if (!debug_locks) print_irqtrace_events(current); --- -2.30.2 - diff --git a/debian/patches-rt/0005-sched-core-Wait-for-tasks-being-pushed-away-on-hotpl.patch b/debian/patches-rt/0005-sched-core-Wait-for-tasks-being-pushed-away-on-hotpl.patch deleted file mode 100644 index da47b13d2..000000000 --- a/debian/patches-rt/0005-sched-core-Wait-for-tasks-being-pushed-away-on-hotpl.patch +++ /dev/null @@ -1,124 +0,0 @@ -From 921cbc00825ced6ea9a1b029a41a8926480d9ab6 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 23 Oct 2020 12:12:02 +0200 -Subject: [PATCH 005/296] sched/core: Wait for tasks being pushed away on - hotplug -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -RT kernels need to ensure that all tasks which are not per CPU kthreads -have left the outgoing CPU to guarantee that no tasks are force migrated -within a migrate disabled section. - -There is also some desire to (ab)use fine grained CPU hotplug control to -clear a CPU from active state to force migrate tasks which are not per CPU -kthreads away for power control purposes. - -Add a mechanism which waits until all tasks which should leave the CPU -after the CPU active flag is cleared have moved to a different online CPU. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 40 +++++++++++++++++++++++++++++++++++++++- - kernel/sched/sched.h | 4 ++++ - 2 files changed, 43 insertions(+), 1 deletion(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index feca16c2a780..116f9797a6d0 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -6894,8 +6894,21 @@ static void balance_push(struct rq *rq) - * Both the cpu-hotplug and stop task are in this case and are - * required to complete the hotplug process. - */ -- if (is_per_cpu_kthread(push_task)) -+ if (is_per_cpu_kthread(push_task)) { -+ /* -+ * If this is the idle task on the outgoing CPU try to wake -+ * up the hotplug control thread which might wait for the -+ * last task to vanish. The rcuwait_active() check is -+ * accurate here because the waiter is pinned on this CPU -+ * and can't obviously be running in parallel. -+ */ -+ if (!rq->nr_running && rcuwait_active(&rq->hotplug_wait)) { -+ raw_spin_unlock(&rq->lock); -+ rcuwait_wake_up(&rq->hotplug_wait); -+ raw_spin_lock(&rq->lock); -+ } - return; -+ } - - get_task_struct(push_task); - /* -@@ -6926,6 +6939,20 @@ static void balance_push_set(int cpu, bool on) - rq_unlock_irqrestore(rq, &rf); - } - -+/* -+ * Invoked from a CPUs hotplug control thread after the CPU has been marked -+ * inactive. All tasks which are not per CPU kernel threads are either -+ * pushed off this CPU now via balance_push() or placed on a different CPU -+ * during wakeup. Wait until the CPU is quiescent. 
-+ */ -+static void balance_hotplug_wait(void) -+{ -+ struct rq *rq = this_rq(); -+ -+ rcuwait_wait_event(&rq->hotplug_wait, rq->nr_running == 1, -+ TASK_UNINTERRUPTIBLE); -+} -+ - #else - - static inline void balance_push(struct rq *rq) -@@ -6936,6 +6963,10 @@ static inline void balance_push_set(int cpu, bool on) - { - } - -+static inline void balance_hotplug_wait(void) -+{ -+} -+ - #endif /* CONFIG_HOTPLUG_CPU */ - - void set_rq_online(struct rq *rq) -@@ -7090,6 +7121,10 @@ int sched_cpu_deactivate(unsigned int cpu) - return ret; - } - sched_domains_numa_masks_clear(cpu); -+ -+ /* Wait for all non per CPU kernel threads to vanish. */ -+ balance_hotplug_wait(); -+ - return 0; - } - -@@ -7330,6 +7365,9 @@ void __init sched_init(void) - - rq_csd_init(rq, &rq->nohz_csd, nohz_csd_func); - #endif -+#ifdef CONFIG_HOTPLUG_CPU -+ rcuwait_init(&rq->hotplug_wait); -+#endif - #endif /* CONFIG_SMP */ - hrtick_rq_init(rq); - atomic_set(&rq->nr_iowait, 0); -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index ca3106f66351..7dcd5ac722ec 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h -@@ -998,6 +998,10 @@ struct rq { - - /* This is used to determine avg_idle's max value */ - u64 max_idle_balance_cost; -+ -+#ifdef CONFIG_HOTPLUG_CPU -+ struct rcuwait hotplug_wait; -+#endif - #endif /* CONFIG_SMP */ - - #ifdef CONFIG_IRQ_TIME_ACCOUNTING --- -2.30.2 - diff --git a/debian/patches-rt/0005-u64_stats-Introduce-u64_stats_set.patch b/debian/patches-rt/0005-u64_stats-Introduce-u64_stats_set.patch new file mode 100644 index 000000000..bfee561b6 --- /dev/null +++ b/debian/patches-rt/0005-u64_stats-Introduce-u64_stats_set.patch @@ -0,0 +1,45 @@ +From: "Ahmed S. 
Darwish" <a.darwish@linutronix.de>
+Date: Sat, 16 Oct 2021 10:49:06 +0200
+Subject: [PATCH 5/9] u64_stats: Introduce u64_stats_set()
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+Allow directly setting a u64_stats_t value, which is used to provide an init
+function that sets it directly to zero instead of memset()ing the value.
+
+Add u64_stats_set() to the u64_stats API.
+
+[bigeasy: commit message. ]
+
+Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: David S. Miller <davem@davemloft.net>
+---
+ include/linux/u64_stats_sync.h | 10 ++++++++++
+ 1 file changed, 10 insertions(+)
+
+--- a/include/linux/u64_stats_sync.h
++++ b/include/linux/u64_stats_sync.h
+@@ -83,6 +83,11 @@ static inline u64 u64_stats_read(const u
+ return local64_read(&p->v);
+ }
+
++static inline void u64_stats_set(u64_stats_t *p, u64 val)
++{
++ local64_set(&p->v, val);
++}
++
+ static inline void u64_stats_add(u64_stats_t *p, unsigned long val)
+ {
+ local64_add(val, &p->v);
+@@ -104,6 +109,11 @@ static inline u64 u64_stats_read(const u
+ return p->v;
+ }
+
++static inline void u64_stats_set(u64_stats_t *p, u64 val)
++{
++ p->v = val;
++}
++
+ static inline void u64_stats_add(u64_stats_t *p, unsigned long val)
+ {
+ p->v += val; diff --git a/debian/patches-rt/0005_kcov_replace_local_irq_save_with_a_local_lock_t.patch b/debian/patches-rt/0005_kcov_replace_local_irq_save_with_a_local_lock_t.patch new file mode 100644 index 000000000..d3f0ab63d --- /dev/null +++ b/debian/patches-rt/0005_kcov_replace_local_irq_save_with_a_local_lock_t.patch @@ -0,0 +1,159 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: kcov: Replace local_irq_save() with a local_lock_t. 
+Date: Mon, 30 Aug 2021 19:26:27 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+The kcov code mixes local_irq_save() and spin_lock() in
+kcov_remote_{start|end}(). This creates a warning on PREEMPT_RT because
+local_irq_save() disables interrupts and spin_lock_t is turned into a
+sleeping lock which can not be acquired in a section with disabled
+interrupts.
+
+The kcov_remote_lock is used to synchronize the access to the hash-list
+kcov_remote_map. The local_irq_save() block protects access to the
+per-CPU data kcov_percpu_data.
+
+There is no compelling reason to change the lock type to raw_spin_lock_t to
+make it work with local_irq_save(). Changing it would require moving the
+memory allocation (in kcov_remote_add()) and deallocation outside of the
+locked section.
+Adding an unlimited amount of entries to the hashlist will increase the
+IRQ-off time during lookup. It could be argued that this is debug code
+and the latency does not matter. There is however no need to do so and
+it would allow this facility to be used in an RT enabled build.
+
+Using a local_lock_t instead of local_irq_save() has the benefit of adding
+a protection scope within the source which makes it obvious what is
+protected. On a !PREEMPT_RT && !LOCKDEP build the local_lock_irqsave()
+maps directly to local_irq_save(), so there is no additional overhead at
+runtime.
+
+Replace the local_irq_save() section with a local_lock_t. 
+ +Reported-by: Clark Williams <williams@redhat.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210830172627.267989-6-bigeasy@linutronix.de +--- + kernel/kcov.c | 30 +++++++++++++++++------------- + 1 file changed, 17 insertions(+), 13 deletions(-) + +--- a/kernel/kcov.c ++++ b/kernel/kcov.c +@@ -88,6 +88,7 @@ static struct list_head kcov_remote_area + + struct kcov_percpu_data { + void *irq_area; ++ local_lock_t lock; + + unsigned int saved_mode; + unsigned int saved_size; +@@ -96,7 +97,9 @@ struct kcov_percpu_data { + int saved_sequence; + }; + +-static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data); ++static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data) = { ++ .lock = INIT_LOCAL_LOCK(lock), ++}; + + /* Must be called with kcov_remote_lock locked. */ + static struct kcov_remote *kcov_remote_find(u64 handle) +@@ -824,7 +827,7 @@ void kcov_remote_start(u64 handle) + if (!in_task() && !in_serving_softirq()) + return; + +- local_irq_save(flags); ++ local_lock_irqsave(&kcov_percpu_data.lock, flags); + + /* + * Check that kcov_remote_start() is not called twice in background +@@ -832,7 +835,7 @@ void kcov_remote_start(u64 handle) + */ + mode = READ_ONCE(t->kcov_mode); + if (WARN_ON(in_task() && kcov_mode_enabled(mode))) { +- local_irq_restore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + return; + } + /* +@@ -841,14 +844,15 @@ void kcov_remote_start(u64 handle) + * happened while collecting coverage from a background thread. 
+ */ + if (WARN_ON(in_serving_softirq() && t->kcov_softirq)) { +- local_irq_restore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + return; + } + + spin_lock(&kcov_remote_lock); + remote = kcov_remote_find(handle); + if (!remote) { +- spin_unlock_irqrestore(&kcov_remote_lock, flags); ++ spin_unlock(&kcov_remote_lock); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + return; + } + kcov_debug("handle = %llx, context: %s\n", handle, +@@ -873,13 +877,13 @@ void kcov_remote_start(u64 handle) + + /* Can only happen when in_task(). */ + if (!area) { +- local_irqrestore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + area = vmalloc(size * sizeof(unsigned long)); + if (!area) { + kcov_put(kcov); + return; + } +- local_irq_save(flags); ++ local_lock_irqsave(&kcov_percpu_data.lock, flags); + } + + /* Reset coverage size. */ +@@ -891,7 +895,7 @@ void kcov_remote_start(u64 handle) + } + kcov_start(t, kcov, size, area, mode, sequence); + +- local_irq_restore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + + } + EXPORT_SYMBOL(kcov_remote_start); +@@ -965,12 +969,12 @@ void kcov_remote_stop(void) + if (!in_task() && !in_serving_softirq()) + return; + +- local_irq_save(flags); ++ local_lock_irqsave(&kcov_percpu_data.lock, flags); + + mode = READ_ONCE(t->kcov_mode); + barrier(); + if (!kcov_mode_enabled(mode)) { +- local_irq_restore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + return; + } + /* +@@ -978,12 +982,12 @@ void kcov_remote_stop(void) + * actually found the remote handle and started collecting coverage. + */ + if (in_serving_softirq() && !t->kcov_softirq) { +- local_irq_restore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + return; + } + /* Make sure that kcov_softirq is only set when in softirq. 
*/ + if (WARN_ON(!in_serving_softirq() && t->kcov_softirq)) { +- local_irq_restore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + return; + } + +@@ -1013,7 +1017,7 @@ void kcov_remote_stop(void) + spin_unlock(&kcov_remote_lock); + } + +- local_irq_restore(flags); ++ local_unlock_irqrestore(&kcov_percpu_data.lock, flags); + + /* Get in kcov_remote_start(). */ + kcov_put(kcov); diff --git a/debian/patches-rt/0005_sched_make_might_sleep_output_less_confusing.patch b/debian/patches-rt/0005_sched_make_might_sleep_output_less_confusing.patch new file mode 100644 index 000000000..e1dee86f6 --- /dev/null +++ b/debian/patches-rt/0005_sched_make_might_sleep_output_less_confusing.patch @@ -0,0 +1,134 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Make might_sleep() output less confusing +Date: Thu, 23 Sep 2021 18:54:41 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +might_sleep() output is pretty informative, but can be confusing at times +especially with PREEMPT_RCU when the check triggers due to a voluntary +sleep inside a RCU read side critical section: + + BUG: sleeping function called from invalid context at kernel/test.c:110 + in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 415, name: kworker/u112:52 + Preemption disabled at: migrate_disable+0x33/0xa0 + +in_atomic() is 0, but it still tells that preemption was disabled at +migrate_disable(), which is completely useless because preemption is not +disabled. But the interesting information to decode the above, i.e. the RCU +nesting depth, is not printed. + +That becomes even more confusing when might_sleep() is invoked from +cond_resched_lock() within a RCU read side critical section. Here the +expected preemption count is 1 and not 0. 
+
+ BUG: sleeping function called from invalid context at kernel/test.c:131
+ in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 415, name: kworker/u112:52
+ Preemption disabled at: test_cond_lock+0xf3/0x1c0
+
+So in_atomic() is set, which is expected as the caller holds a spinlock,
+but it's unclear why this is broken and the preempt disable IP is just
+pointing at the correct place, i.e. spin_lock(), which is obviously not
+helpful either.
+
+Make that more useful in general:
+
+ - Print preempt_count() and the expected value
+
+and for the CONFIG_PREEMPT_RCU case:
+
+ - Print the RCU read side critical section nesting depth
+
+ - Print the preempt disable IP only when preempt count
+ does not have the expected value.
+
+So the might_sleep() dump from within a preemptible RCU read side
+critical section becomes:
+
+ BUG: sleeping function called from invalid context at kernel/test.c:110
+ in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 415, name: kworker/u112:52
+ preempt_count: 0, expected: 0
+ RCU nest depth: 1, expected: 0
+
+and the cond_resched_lock() case becomes:
+
+ BUG: sleeping function called from invalid context at kernel/test.c:141
+ in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 415, name: kworker/u112:52
+ preempt_count: 1, expected: 1
+ RCU nest depth: 1, expected: 0
+
+which makes it pretty obvious what's going on. 
For all other cases the +preempt disable IP is still printed as before: + + BUG: sleeping function called from invalid context at kernel/test.c: 156 + in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 + preempt_count: 1, expected: 0 + RCU nest depth: 0, expected: 0 + Preemption disabled at: + [<ffffffff82b48326>] test_might_sleep+0xbe/0xf8 + + BUG: sleeping function called from invalid context at kernel/test.c: 163 + in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 + preempt_count: 1, expected: 0 + RCU nest depth: 1, expected: 0 + Preemption disabled at: + [<ffffffff82b48326>] test_might_sleep+0x1e4/0x280 + +This also prepares to provide a better debugging output for RT enabled +kernels and their spinlock substitutions. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210923165358.181022656@linutronix.de +--- + kernel/sched/core.c | 27 ++++++++++++++++++++++----- + 1 file changed, 22 insertions(+), 5 deletions(-) + +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -9493,6 +9493,18 @@ void __might_sleep(const char *file, int + } + EXPORT_SYMBOL(__might_sleep); + ++static void print_preempt_disable_ip(int preempt_offset, unsigned long ip) ++{ ++ if (!IS_ENABLED(CONFIG_DEBUG_PREEMPT)) ++ return; ++ ++ if (preempt_count() == preempt_offset) ++ return; ++ ++ pr_err("Preemption disabled at:"); ++ print_ip_sym(KERN_ERR, ip); ++} ++ + void __might_resched(const char *file, int line, int preempt_offset) + { + /* Ratelimiting timestamp: */ +@@ -9521,6 +9533,13 @@ void __might_resched(const char *file, i + pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n", + in_atomic(), irqs_disabled(), current->non_block_count, + current->pid, current->comm); ++ pr_err("preempt_count: %x, expected: %x\n", preempt_count(), ++ preempt_offset); ++ ++ if (IS_ENABLED(CONFIG_PREEMPT_RCU)) { ++ 
pr_err("RCU nest depth: %d, expected: 0\n",
++ rcu_preempt_depth());
++ }
+
+ if (task_stack_end_corrupted(current))
+ pr_emerg("Thread overran stack, or stack corrupted\n");
+@@ -9528,11 +9547,9 @@ void __might_resched(const char *file, i
+ debug_show_held_locks(current);
+ if (irqs_disabled())
+ print_irqtrace_events(current);
+- if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)
+- && !preempt_count_equals(preempt_offset)) {
+- pr_err("Preemption disabled at:");
+- print_ip_sym(KERN_ERR, preempt_disable_ip);
+- }
++
++ print_preempt_disable_ip(preempt_offset, preempt_disable_ip);
++
+ dump_stack();
+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
+ }
diff --git a/debian/patches-rt/0005_sched_move_mmdrop_to_rcu_on_rt.patch b/debian/patches-rt/0005_sched_move_mmdrop_to_rcu_on_rt.patch
new file mode 100644
index 000000000..e0bb37a54
--- /dev/null
+++ b/debian/patches-rt/0005_sched_move_mmdrop_to_rcu_on_rt.patch
@@ -0,0 +1,105 @@
+From: Thomas Gleixner <tglx@linutronix.de>
+Subject: sched: Move mmdrop to RCU on RT
+Date: Tue, 28 Sep 2021 14:24:32 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+mmdrop() is invoked from finish_task_switch() by the incoming task to drop
+the mm which was handed over by the previous task. mmdrop() can be quite
+expensive which prevents an incoming real-time task from getting useful
+work done.
+
+Provide mmdrop_sched() which maps to mmdrop() on !RT kernels. On RT kernels
+it delegates the eventually required invocation of __mmdrop() to RCU. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210928122411.648582026@linutronix.de +--- + include/linux/mm_types.h | 4 ++++ + include/linux/sched/mm.h | 20 ++++++++++++++++++++ + kernel/fork.c | 13 +++++++++++++ + kernel/sched/core.c | 2 +- + 4 files changed, 38 insertions(+), 1 deletion(-) +--- +--- a/include/linux/mm_types.h ++++ b/include/linux/mm_types.h +@@ -12,6 +12,7 @@ + #include <linux/completion.h> + #include <linux/cpumask.h> + #include <linux/uprobes.h> ++#include <linux/rcupdate.h> + #include <linux/page-flags-layout.h> + #include <linux/workqueue.h> + #include <linux/seqlock.h> +@@ -572,6 +573,9 @@ struct mm_struct { + bool tlb_flush_batched; + #endif + struct uprobes_state uprobes_state; ++#ifdef CONFIG_PREEMPT_RT ++ struct rcu_head delayed_drop; ++#endif + #ifdef CONFIG_HUGETLB_PAGE + atomic_long_t hugetlb_usage; + #endif +--- a/include/linux/sched/mm.h ++++ b/include/linux/sched/mm.h +@@ -49,6 +49,26 @@ static inline void mmdrop(struct mm_stru + __mmdrop(mm); + } + ++#ifdef CONFIG_PREEMPT_RT ++extern void __mmdrop_delayed(struct rcu_head *rhp); ++ ++/* ++ * Invoked from finish_task_switch(). Delegates the heavy lifting on RT ++ * kernels via RCU. ++ */ ++static inline void mmdrop_sched(struct mm_struct *mm) ++{ ++ /* Provides a full memory barrier. See mmdrop() */ ++ if (atomic_dec_and_test(&mm->mm_count)) ++ call_rcu(&mm->delayed_drop, __mmdrop_delayed); ++} ++#else ++static inline void mmdrop_sched(struct mm_struct *mm) ++{ ++ mmdrop(mm); ++} ++#endif ++ + /** + * mmget() - Pin the address space associated with a &struct mm_struct. + * @mm: The address space to pin. +--- a/kernel/fork.c ++++ b/kernel/fork.c +@@ -708,6 +708,19 @@ void __mmdrop(struct mm_struct *mm) + } + EXPORT_SYMBOL_GPL(__mmdrop); + ++#ifdef CONFIG_PREEMPT_RT ++/* ++ * RCU callback for delayed mm drop. 
Not strictly RCU, but call_rcu() is ++ * by far the least expensive way to do that. ++ */ ++void __mmdrop_delayed(struct rcu_head *rhp) ++{ ++ struct mm_struct *mm = container_of(rhp, struct mm_struct, delayed_drop); ++ ++ __mmdrop(mm); ++} ++#endif ++ + static void mmdrop_async_fn(struct work_struct *work) + { + struct mm_struct *mm; +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -4839,7 +4839,7 @@ static struct rq *finish_task_switch(str + */ + if (mm) { + membarrier_mm_sync_core_before_usermode(mm); +- mmdrop(mm); ++ mmdrop_sched(mm); + } + if (unlikely(prev_state == TASK_DEAD)) { + if (prev->sched_class->task_dead) diff --git a/debian/patches-rt/0259-drm-i915-disable-tracing-on-RT.patch b/debian/patches-rt/0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch index 81748136c..c80aa18c7 100644 --- a/debian/patches-rt/0259-drm-i915-disable-tracing-on-RT.patch +++ b/debian/patches-rt/0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch @@ -1,8 +1,7 @@ -From f93f12dffe38a88380062a39ad36fea143830d21 Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Date: Thu, 6 Dec 2018 09:52:20 +0100 -Subject: [PATCH 259/296] drm/i915: disable tracing on -RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Subject: [PATCH 06/10] drm/i915: Disable tracing points on PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz Luca Abeni reported this: | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003 @@ -14,21 +13,23 @@ Luca Abeni reported this: | trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915] The tracing events use trace_i915_pipe_update_start() among other events -use functions acquire spin locks. A few trace points use +use functions acquire spinlock_t locks which are transformed into +sleeping locks on PREEMPT_RT. 
A few trace points use
+ intel_get_crtc_scanline(), others use ->get_vblank_counter() which also
-might acquire a sleeping lock.
+might acquire a sleeping lock on PREEMPT_RT.
+At the time the arguments are evaluated within a trace point, preemption
+is disabled and so the locks must not be acquired on PREEMPT_RT.
 
-Based on this I don't see any other way than disable trace points on RT.
+Based on this I don't see any other way than disabling trace points on
+PREEMPT_RT.
 
-Cc: stable-rt@vger.kernel.org
 Reported-by: Luca Abeni <lucabe72@gmail.com>
+Cc: Steven Rostedt <rostedt@goodmis.org>
 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
 ---
- drivers/gpu/drm/i915/i915_trace.h | 4 ++++
+ drivers/gpu/drm/i915/i915_trace.h | 4 ++++
 1 file changed, 4 insertions(+)
 
-diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
-index a4addcc64978..0ba5a0a0fd25 100644
 --- a/drivers/gpu/drm/i915/i915_trace.h
 +++ b/drivers/gpu/drm/i915/i915_trace.h
 @@ -2,6 +2,10 @@
 
 #include <linux/stringify.h>
 #include <linux/types.h>
 #include <linux/tracepoint.h>
---
-2.30.2
-
diff --git a/debian/patches-rt/0006-lockdep-selftests-Add-rtmutex-to-the-last-column.patch b/debian/patches-rt/0006-lockdep-selftests-Add-rtmutex-to-the-last-column.patch
new file mode 100644
index 000000000..75879101f
--- /dev/null
+++ b/debian/patches-rt/0006-lockdep-selftests-Add-rtmutex-to-the-last-column.patch
@@ -0,0 +1,24 @@
+From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Date: Thu, 12 Aug 2021 16:16:54 +0200
+Subject: [PATCH 06/10] lockdep/selftests: Add rtmutex to the last column
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+The last column contains the results for the rtmutex tests.
+Add it.
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + lib/locking-selftest.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/lib/locking-selftest.c ++++ b/lib/locking-selftest.c +@@ -2812,7 +2812,7 @@ void locking_selftest(void) + printk("------------------------\n"); + printk("| Locking API testsuite:\n"); + printk("----------------------------------------------------------------------------\n"); +- printk(" | spin |wlock |rlock |mutex | wsem | rsem |\n"); ++ printk(" | spin |wlock |rlock |mutex | wsem | rsem |rtmutex\n"); + printk(" --------------------------------------------------------------------------\n"); + + init_shared_classes(); diff --git a/debian/patches-rt/0006-net-sched-Protect-Qdisc-bstats-with-u64_stats.patch b/debian/patches-rt/0006-net-sched-Protect-Qdisc-bstats-with-u64_stats.patch new file mode 100644 index 000000000..f06e4324a --- /dev/null +++ b/debian/patches-rt/0006-net-sched-Protect-Qdisc-bstats-with-u64_stats.patch @@ -0,0 +1,310 @@ +From: "Ahmed S. Darwish" <a.darwish@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:07 +0200 +Subject: [PATCH 6/9] net: sched: Protect Qdisc::bstats with u64_stats +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The not-per-CPU variant of qdisc tc (traffic control) statistics, +Qdisc::gnet_stats_basic_packed bstats, is protected with Qdisc::running +sequence counter. + +This sequence counter is used for reliably protecting bstats reads from +parallel writes. Meanwhile, the seqcount's write section covers a much +wider area than bstats update: qdisc_run_begin() => qdisc_run_end(). + +That read/write section asymmetry can lead to needless retries of the +read section. To prepare for removing the Qdisc::running sequence +counter altogether, introduce a u64_stats sync point inside bstats +instead. + +Modify _bstats_update() to start/end the bstats u64_stats write +section. 
+ +For bisectability, and finer commits granularity, the bstats read +section is still protected with a Qdisc::running read/retry loop and +qdisc_run_begin/end() still starts/ends that seqcount write section. +Once all call sites are modified to use _bstats_update(), the +Qdisc::running seqcount will be removed and bstats read/retry loop will +be modified to utilize the internal u64_stats sync point. + +Note, using u64_stats implies no sequence counter protection for 64-bit +architectures. This can lead to the statistics "packets" vs. "bytes" +values getting out of sync on rare occasions. The individual values will +still be valid. + +[bigeasy: Minor commit message edits, init all gnet_stats_basic_packed.] + +Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. Miller <davem@davemloft.net> +--- + include/net/gen_stats.h | 2 ++ + include/net/sch_generic.h | 2 ++ + net/core/gen_estimator.c | 2 +- + net/core/gen_stats.c | 14 ++++++++++++-- + net/netfilter/xt_RATEEST.c | 1 + + net/sched/act_api.c | 2 ++ + net/sched/sch_atm.c | 1 + + net/sched/sch_cbq.c | 1 + + net/sched/sch_drr.c | 1 + + net/sched/sch_ets.c | 2 +- + net/sched/sch_generic.c | 1 + + net/sched/sch_gred.c | 4 +++- + net/sched/sch_hfsc.c | 1 + + net/sched/sch_htb.c | 7 +++++-- + net/sched/sch_mq.c | 2 +- + net/sched/sch_mqprio.c | 5 +++-- + net/sched/sch_qfq.c | 1 + + 17 files changed, 39 insertions(+), 10 deletions(-) + +--- a/include/net/gen_stats.h ++++ b/include/net/gen_stats.h +@@ -11,6 +11,7 @@ + struct gnet_stats_basic_packed { + __u64 bytes; + __u64 packets; ++ struct u64_stats_sync syncp; + }; + + struct gnet_stats_basic_cpu { +@@ -34,6 +35,7 @@ struct gnet_dump { + struct tc_stats tc_stats; + }; + ++void gnet_stats_basic_packed_init(struct gnet_stats_basic_packed *b); + int gnet_stats_start_copy(struct sk_buff *skb, int type, spinlock_t *lock, + struct gnet_dump *d, int padattr); + +--- 
a/include/net/sch_generic.h ++++ b/include/net/sch_generic.h +@@ -848,8 +848,10 @@ static inline int qdisc_enqueue(struct s + static inline void _bstats_update(struct gnet_stats_basic_packed *bstats, + __u64 bytes, __u32 packets) + { ++ u64_stats_update_begin(&bstats->syncp); + bstats->bytes += bytes; + bstats->packets += packets; ++ u64_stats_update_end(&bstats->syncp); + } + + static inline void bstats_update(struct gnet_stats_basic_packed *bstats, +--- a/net/core/gen_estimator.c ++++ b/net/core/gen_estimator.c +@@ -62,7 +62,7 @@ struct net_rate_estimator { + static void est_fetch_counters(struct net_rate_estimator *e, + struct gnet_stats_basic_packed *b) + { +- memset(b, 0, sizeof(*b)); ++ gnet_stats_basic_packed_init(b); + if (e->stats_lock) + spin_lock(e->stats_lock); + +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -18,7 +18,7 @@ + #include <linux/gen_stats.h> + #include <net/netlink.h> + #include <net/gen_stats.h> +- ++#include <net/sch_generic.h> + + static inline int + gnet_stats_copy(struct gnet_dump *d, int type, void *buf, int size, int padattr) +@@ -114,6 +114,15 @@ gnet_stats_start_copy(struct sk_buff *sk + } + EXPORT_SYMBOL(gnet_stats_start_copy); + ++/* Must not be inlined, due to u64_stats seqcount_t lockdep key */ ++void gnet_stats_basic_packed_init(struct gnet_stats_basic_packed *b) ++{ ++ b->bytes = 0; ++ b->packets = 0; ++ u64_stats_init(&b->syncp); ++} ++EXPORT_SYMBOL(gnet_stats_basic_packed_init); ++ + static void gnet_stats_add_basic_cpu(struct gnet_stats_basic_packed *bstats, + struct gnet_stats_basic_cpu __percpu *cpu) + { +@@ -167,8 +176,9 @@ static int + struct gnet_stats_basic_packed *b, + int type) + { +- struct gnet_stats_basic_packed bstats = {0}; ++ struct gnet_stats_basic_packed bstats; + ++ gnet_stats_basic_packed_init(&bstats); + gnet_stats_add_basic(running, &bstats, cpu, b); + + if (d->compat_tc_stats && type == TCA_STATS_BASIC) { +--- a/net/netfilter/xt_RATEEST.c ++++ b/net/netfilter/xt_RATEEST.c +@@ -143,6 +143,7 
@@ static int xt_rateest_tg_checkentry(cons + if (!est) + goto err1; + ++ gnet_stats_basic_packed_init(&est->bstats); + strlcpy(est->name, info->name, sizeof(est->name)); + spin_lock_init(&est->lock); + est->refcnt = 1; +--- a/net/sched/act_api.c ++++ b/net/sched/act_api.c +@@ -490,6 +490,8 @@ int tcf_idr_create(struct tc_action_net + if (!p->cpu_qstats) + goto err3; + } ++ gnet_stats_basic_packed_init(&p->tcfa_bstats); ++ gnet_stats_basic_packed_init(&p->tcfa_bstats_hw); + spin_lock_init(&p->tcfa_lock); + p->tcfa_index = index; + p->tcfa_tm.install = jiffies; +--- a/net/sched/sch_atm.c ++++ b/net/sched/sch_atm.c +@@ -548,6 +548,7 @@ static int atm_tc_init(struct Qdisc *sch + pr_debug("atm_tc_init(sch %p,[qdisc %p],opt %p)\n", sch, p, opt); + INIT_LIST_HEAD(&p->flows); + INIT_LIST_HEAD(&p->link.list); ++ gnet_stats_basic_packed_init(&p->link.bstats); + list_add(&p->link.list, &p->flows); + p->link.q = qdisc_create_dflt(sch->dev_queue, + &pfifo_qdisc_ops, sch->handle, extack); +--- a/net/sched/sch_cbq.c ++++ b/net/sched/sch_cbq.c +@@ -1611,6 +1611,7 @@ cbq_change_class(struct Qdisc *sch, u32 + if (cl == NULL) + goto failure; + ++ gnet_stats_basic_packed_init(&cl->bstats); + err = tcf_block_get(&cl->block, &cl->filter_list, sch, extack); + if (err) { + kfree(cl); +--- a/net/sched/sch_drr.c ++++ b/net/sched/sch_drr.c +@@ -106,6 +106,7 @@ static int drr_change_class(struct Qdisc + if (cl == NULL) + return -ENOBUFS; + ++ gnet_stats_basic_packed_init(&cl->bstats); + cl->common.classid = classid; + cl->quantum = quantum; + cl->qdisc = qdisc_create_dflt(sch->dev_queue, +--- a/net/sched/sch_ets.c ++++ b/net/sched/sch_ets.c +@@ -689,7 +689,7 @@ static int ets_qdisc_change(struct Qdisc + q->classes[i].qdisc = NULL; + q->classes[i].quantum = 0; + q->classes[i].deficit = 0; +- memset(&q->classes[i].bstats, 0, sizeof(q->classes[i].bstats)); ++ gnet_stats_basic_packed_init(&q->classes[i].bstats); + memset(&q->classes[i].qstats, 0, sizeof(q->classes[i].qstats)); + } + return 0; 
+--- a/net/sched/sch_generic.c ++++ b/net/sched/sch_generic.c +@@ -892,6 +892,7 @@ struct Qdisc *qdisc_alloc(struct netdev_ + __skb_queue_head_init(&sch->gso_skb); + __skb_queue_head_init(&sch->skb_bad_txq); + qdisc_skb_head_init(&sch->q); ++ gnet_stats_basic_packed_init(&sch->bstats); + spin_lock_init(&sch->q.lock); + + if (ops->static_flags & TCQ_F_CPUSTATS) { +--- a/net/sched/sch_gred.c ++++ b/net/sched/sch_gred.c +@@ -364,9 +364,11 @@ static int gred_offload_dump_stats(struc + hw_stats->handle = sch->handle; + hw_stats->parent = sch->parent; + +- for (i = 0; i < MAX_DPs; i++) ++ for (i = 0; i < MAX_DPs; i++) { ++ gnet_stats_basic_packed_init(&hw_stats->stats.bstats[i]); + if (table->tab[i]) + hw_stats->stats.xstats[i] = &table->tab[i]->stats; ++ } + + ret = qdisc_offload_dump_helper(sch, TC_SETUP_QDISC_GRED, hw_stats); + /* Even if driver returns failure adjust the stats - in case offload +--- a/net/sched/sch_hfsc.c ++++ b/net/sched/sch_hfsc.c +@@ -1406,6 +1406,7 @@ hfsc_init_qdisc(struct Qdisc *sch, struc + if (err) + return err; + ++ gnet_stats_basic_packed_init(&q->root.bstats); + q->root.cl_common.classid = sch->handle; + q->root.sched = q; + q->root.qdisc = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops, +--- a/net/sched/sch_htb.c ++++ b/net/sched/sch_htb.c +@@ -1311,7 +1311,7 @@ static void htb_offload_aggregate_stats( + struct htb_class *c; + unsigned int i; + +- memset(&cl->bstats, 0, sizeof(cl->bstats)); ++ gnet_stats_basic_packed_init(&cl->bstats); + + for (i = 0; i < q->clhash.hashsize; i++) { + hlist_for_each_entry(c, &q->clhash.hash[i], common.hnode) { +@@ -1357,7 +1357,7 @@ htb_dump_class_stats(struct Qdisc *sch, + if (cl->leaf.q) + cl->bstats = cl->leaf.q->bstats; + else +- memset(&cl->bstats, 0, sizeof(cl->bstats)); ++ gnet_stats_basic_packed_init(&cl->bstats); + cl->bstats.bytes += cl->bstats_bias.bytes; + cl->bstats.packets += cl->bstats_bias.packets; + } else { +@@ -1849,6 +1849,9 @@ static int htb_change_class(struct Qdisc + if (!cl) + 
goto failure; + ++ gnet_stats_basic_packed_init(&cl->bstats); ++ gnet_stats_basic_packed_init(&cl->bstats_bias); ++ + err = tcf_block_get(&cl->block, &cl->filter_list, sch, extack); + if (err) { + kfree(cl); +--- a/net/sched/sch_mq.c ++++ b/net/sched/sch_mq.c +@@ -132,7 +132,7 @@ static int mq_dump(struct Qdisc *sch, st + unsigned int ntx; + + sch->q.qlen = 0; +- memset(&sch->bstats, 0, sizeof(sch->bstats)); ++ gnet_stats_basic_packed_init(&sch->bstats); + memset(&sch->qstats, 0, sizeof(sch->qstats)); + + /* MQ supports lockless qdiscs. However, statistics accounting needs +--- a/net/sched/sch_mqprio.c ++++ b/net/sched/sch_mqprio.c +@@ -390,7 +390,7 @@ static int mqprio_dump(struct Qdisc *sch + unsigned int ntx, tc; + + sch->q.qlen = 0; +- memset(&sch->bstats, 0, sizeof(sch->bstats)); ++ gnet_stats_basic_packed_init(&sch->bstats); + memset(&sch->qstats, 0, sizeof(sch->qstats)); + + /* MQ supports lockless qdiscs. However, statistics accounting needs +@@ -500,10 +500,11 @@ static int mqprio_dump_class_stats(struc + int i; + __u32 qlen; + struct gnet_stats_queue qstats = {0}; +- struct gnet_stats_basic_packed bstats = {0}; ++ struct gnet_stats_basic_packed bstats; + struct net_device *dev = qdisc_dev(sch); + struct netdev_tc_txq tc = dev->tc_to_txq[cl & TC_BITMASK]; + ++ gnet_stats_basic_packed_init(&bstats); + /* Drop lock here it will be reclaimed before touching + * statistics this is required because the d->lock we + * hold here is the look on dev_queue->qdisc_sleeping +--- a/net/sched/sch_qfq.c ++++ b/net/sched/sch_qfq.c +@@ -465,6 +465,7 @@ static int qfq_change_class(struct Qdisc + if (cl == NULL) + return -ENOBUFS; + ++ gnet_stats_basic_packed_init(&cl->bstats); + cl->common.classid = classid; + cl->deficit = lmax; + diff --git a/debian/patches-rt/0006-workqueue-Manually-break-affinity-on-hotplug.patch b/debian/patches-rt/0006-workqueue-Manually-break-affinity-on-hotplug.patch deleted file mode 100644 index 676678e9d..000000000 --- 
a/debian/patches-rt/0006-workqueue-Manually-break-affinity-on-hotplug.patch +++ /dev/null @@ -1,34 +0,0 @@ -From fee53323b677006531d9119aee46bddbbb33acaf Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:03 +0200 -Subject: [PATCH 006/296] workqueue: Manually break affinity on hotplug -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Don't rely on the scheduler to force break affinity for us -- it will -stop doing that for per-cpu-kthreads. - -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Acked-by: Tejun Heo <tj@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/workqueue.c | 4 ++++ - 1 file changed, 4 insertions(+) - -diff --git a/kernel/workqueue.c b/kernel/workqueue.c -index 1e2ca744dadb..c9a0c961d6e0 100644 ---- a/kernel/workqueue.c -+++ b/kernel/workqueue.c -@@ -4912,6 +4912,10 @@ static void unbind_workers(int cpu) - pool->flags |= POOL_DISASSOCIATED; - - raw_spin_unlock_irq(&pool->lock); -+ -+ for_each_pool_worker(worker, pool) -+ WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_active_mask) < 0); -+ - mutex_unlock(&wq_pool_attach_mutex); - - /* --- -2.30.2 - diff --git a/debian/patches-rt/0006_sched_make_rcu_nest_depth_distinct_in___might_resched.patch b/debian/patches-rt/0006_sched_make_rcu_nest_depth_distinct_in___might_resched.patch new file mode 100644 index 000000000..1aa6fab12 --- /dev/null +++ b/debian/patches-rt/0006_sched_make_rcu_nest_depth_distinct_in___might_resched.patch @@ -0,0 +1,124 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Make RCU nest depth distinct in __might_resched() +Date: Thu, 23 Sep 2021 18:54:43 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +For !RT kernels RCU nest depth in __might_resched() is always expected to +be 0, but on RT kernels it can be non zero while the preempt count is 
+expected to be always 0. + +Instead of playing magic games in interpreting the 'preempt_offset' +argument, rename it to 'offsets' and use the lower 8 bits for the expected +preempt count, allow to hand in the expected RCU nest depth in the upper +bits and adopt the __might_resched() code and related checks and printks. + +The affected call sites are updated in subsequent steps. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210923165358.243232823@linutronix.de +--- + include/linux/kernel.h | 4 ++-- + include/linux/sched.h | 3 +++ + kernel/sched/core.c | 28 ++++++++++++++++------------ + 3 files changed, 21 insertions(+), 14 deletions(-) + +--- a/include/linux/kernel.h ++++ b/include/linux/kernel.h +@@ -111,7 +111,7 @@ static __always_inline void might_resche + #endif /* CONFIG_PREEMPT_* */ + + #ifdef CONFIG_DEBUG_ATOMIC_SLEEP +-extern void __might_resched(const char *file, int line, int preempt_offset); ++extern void __might_resched(const char *file, int line, unsigned int offsets); + extern void __might_sleep(const char *file, int line); + extern void __cant_sleep(const char *file, int line, int preempt_offset); + extern void __cant_migrate(const char *file, int line); +@@ -169,7 +169,7 @@ extern void __cant_migrate(const char *f + # define non_block_end() WARN_ON(current->non_block_count-- == 0) + #else + static inline void __might_resched(const char *file, int line, +- int preempt_offset) { } ++ unsigned int offsets) { } + static inline void __might_sleep(const char *file, int line) { } + # define might_sleep() do { might_resched(); } while (0) + # define cant_sleep() do { } while (0) +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -2057,6 +2057,9 @@ extern int __cond_resched_lock(spinlock_ + extern int __cond_resched_rwlock_read(rwlock_t *lock); + extern int __cond_resched_rwlock_write(rwlock_t *lock); + ++#define MIGHT_RESCHED_RCU_SHIFT 8 
++#define MIGHT_RESCHED_PREEMPT_MASK ((1U << MIGHT_RESCHED_RCU_SHIFT) - 1) ++ + #define cond_resched_lock(lock) ({ \ + __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ + __cond_resched_lock(lock); \ +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -9468,12 +9468,6 @@ void __init sched_init(void) + } + + #ifdef CONFIG_DEBUG_ATOMIC_SLEEP +-static inline int preempt_count_equals(int preempt_offset) +-{ +- int nested = preempt_count() + rcu_preempt_depth(); +- +- return (nested == preempt_offset); +-} + + void __might_sleep(const char *file, int line) + { +@@ -9505,7 +9499,16 @@ static void print_preempt_disable_ip(int + print_ip_sym(KERN_ERR, ip); + } + +-void __might_resched(const char *file, int line, int preempt_offset) ++static inline bool resched_offsets_ok(unsigned int offsets) ++{ ++ unsigned int nested = preempt_count(); ++ ++ nested += rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT; ++ ++ return nested == offsets; ++} ++ ++void __might_resched(const char *file, int line, unsigned int offsets) + { + /* Ratelimiting timestamp: */ + static unsigned long prev_jiffy; +@@ -9515,7 +9518,7 @@ void __might_resched(const char *file, i + /* WARN_ON_ONCE() by default, no rate limit required: */ + rcu_sleep_check(); + +- if ((preempt_count_equals(preempt_offset) && !irqs_disabled() && ++ if ((resched_offsets_ok(offsets) && !irqs_disabled() && + !is_idle_task(current) && !current->non_block_count) || + system_state == SYSTEM_BOOTING || system_state > SYSTEM_RUNNING || + oops_in_progress) +@@ -9534,11 +9537,11 @@ void __might_resched(const char *file, i + in_atomic(), irqs_disabled(), current->non_block_count, + current->pid, current->comm); + pr_err("preempt_count: %x, expected: %x\n", preempt_count(), +- preempt_offset); ++ offsets & MIGHT_RESCHED_PREEMPT_MASK); + + if (IS_ENABLED(CONFIG_PREEMPT_RCU)) { +- pr_err("RCU nest depth: %d, expected: 0\n", +- rcu_preempt_depth()); ++ pr_err("RCU nest depth: %d, expected: %u\n", ++ rcu_preempt_depth(), offsets 
>> MIGHT_RESCHED_RCU_SHIFT);
+	}
+
+	if (task_stack_end_corrupted(current))
+@@ -9548,7 +9551,8 @@ void __might_resched(const char *file, i
+	if (irqs_disabled())
+		print_irqtrace_events(current);
+
+-	print_preempt_disable_ip(preempt_offset, preempt_disable_ip);
++	print_preempt_disable_ip(offsets & MIGHT_RESCHED_PREEMPT_MASK,
++				 preempt_disable_ip);
+
+	dump_stack();
+	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
diff --git a/debian/patches-rt/0260-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch b/debian/patches-rt/0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch
index 504d6093c..8a833fdf0 100644
--- a/debian/patches-rt/0260-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch
+++ b/debian/patches-rt/0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch
@@ -1,33 +1,29 @@
-From 07628f5384e5fde829524295db7b2a91c280d53d Mon Sep 17 00:00:00 2001
 From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
 Date: Wed, 19 Dec 2018 10:47:02 +0100
-Subject: [PATCH 260/296] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with
+Subject: [PATCH 07/10] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with
 NOTRACE
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
 
 The order of the header files is important. If this header file is
 included after tracepoint.h was included then the NOTRACE here becomes a
 nop. Currently this happens for two .c files which use the tracepoints
 behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
+Cc: Steven Rostedt <rostedt@goodmis.org>
 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
 ---
- drivers/gpu/drm/i915/i915_trace.h | 2 +-
+ drivers/gpu/drm/i915/i915_trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 
-diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
-index 0ba5a0a0fd25..396b6598694d 100644
 --- a/drivers/gpu/drm/i915/i915_trace.h
 +++ b/drivers/gpu/drm/i915/i915_trace.h
-@@ -782,7 +782,7 @@ DEFINE_EVENT(i915_request, i915_request_add,
- 	TP_ARGS(rq)
+@@ -826,7 +826,7 @@ DEFINE_EVENT(i915_request, i915_request_
+ 	TP_ARGS(rq)
 );
 
 -#if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS)
 +#if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS) && !defined(NOTRACE)
- DEFINE_EVENT(i915_request, i915_request_submit,
+ DEFINE_EVENT(i915_request, i915_request_guc_submit,
 	 TP_PROTO(struct i915_request *rq),
 	 TP_ARGS(rq)
---
-2.30.2
-
diff --git a/debian/patches-rt/0007-lockdep-selftests-Unbalanced-migrate_disable-rcu_rea.patch b/debian/patches-rt/0007-lockdep-selftests-Unbalanced-migrate_disable-rcu_rea.patch
new file mode 100644
index 000000000..81a042c22
--- /dev/null
+++ b/debian/patches-rt/0007-lockdep-selftests-Unbalanced-migrate_disable-rcu_rea.patch
@@ -0,0 +1,83 @@
+From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Date: Thu, 12 Aug 2021 14:25:38 +0200
+Subject: [PATCH 07/10] lockdep/selftests: Unbalanced migrate_disable() &
+ rcu_read_lock()
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+The tests with unbalanced lock() + unlock() operation leave a modified
+preemption counter behind which is then reset to its original value
+after the test.
+
+The spin_lock() function on PREEMPT_RT does not include a
+preempt_disable() statement but migrate_disable() and rcu_read_lock().
+As a consequence both counters never get back to their original value and
+the system explodes later after the selftest.
+In the double-unlock case on PREEMPT_RT, the migrate_disable() and RCU
+code will trigger which should be avoided. These counters should not be
+decremented below their initial value.
+
+Save both counters and bring them back to their original value after the
+test.
+In the double-unlock case, increment both counters in advance so they
+become balanced after the double unlock.
+
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+---
+ lib/locking-selftest.c | 26 +++++++++++++++++++++++++-
+ 1 file changed, 25 insertions(+), 1 deletion(-)
+
+--- a/lib/locking-selftest.c
++++ b/lib/locking-selftest.c
+@@ -712,12 +712,18 @@ GENERATE_TESTCASE(ABCDBCDA_rtmutex);
+
+ #undef E
+
++#ifdef CONFIG_PREEMPT_RT
++# define RT_PREPARE_DBL_UNLOCK()	{ migrate_disable(); rcu_read_lock(); }
++#else
++# define RT_PREPARE_DBL_UNLOCK()
++#endif
+ /*
+  * Double unlock:
+  */
+ #define E()				\
+					\
+	LOCK(A);			\
++	RT_PREPARE_DBL_UNLOCK();	\
+	UNLOCK(A);			\
+	UNLOCK(A); /* fail */
+
+@@ -1398,7 +1404,13 @@ static int unexpected_testcase_failures;
+
+ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
+ {
+-	unsigned long saved_preempt_count = preempt_count();
++	int saved_preempt_count = preempt_count();
++#ifdef CONFIG_PREEMPT_RT
++#ifdef CONFIG_SMP
++	int saved_mgd_count = current->migration_disabled;
++#endif
++	int saved_rcu_count = current->rcu_read_lock_nesting;
++#endif
+
+	WARN_ON(irqs_disabled());
+
+@@ -1432,6 +1444,18 @@ static void dotest(void (*testcase_fn)(v
+	 * count, so restore it:
+	 */
+	preempt_count_set(saved_preempt_count);
++
++#ifdef CONFIG_PREEMPT_RT
++#ifdef CONFIG_SMP
++	while (current->migration_disabled > saved_mgd_count)
++		migrate_enable();
++#endif
++
++	while (current->rcu_read_lock_nesting > saved_rcu_count)
++		rcu_read_unlock();
++	WARN_ON_ONCE(current->rcu_read_lock_nesting < saved_rcu_count);
++#endif
++
+ #ifdef CONFIG_TRACE_IRQFLAGS
+	if (softirq_count())
+		current->softirqs_enabled = 0;
diff --git
a/debian/patches-rt/0007-net-sched-Use-_bstats_update-set-instead-of-raw-writ.patch b/debian/patches-rt/0007-net-sched-Use-_bstats_update-set-instead-of-raw-writ.patch new file mode 100644 index 000000000..fb9d09e16 --- /dev/null +++ b/debian/patches-rt/0007-net-sched-Use-_bstats_update-set-instead-of-raw-writ.patch @@ -0,0 +1,178 @@ +From: "Ahmed S. Darwish" <a.darwish@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:08 +0200 +Subject: [PATCH 7/9] net: sched: Use _bstats_update/set() instead of raw + writes +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The Qdisc::running sequence counter, used to protect Qdisc::bstats reads +from parallel writes, is in the process of being removed. Qdisc::bstats +read/writes will synchronize using an internal u64_stats sync point +instead. + +Modify all bstats writes to use _bstats_update(). This ensures that +the internal u64_stats sync point is always acquired and released as +appropriate. + +Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. 
Miller <davem@davemloft.net> +--- + net/core/gen_stats.c | 9 +++++---- + net/sched/sch_cbq.c | 3 +-- + net/sched/sch_gred.c | 7 ++++--- + net/sched/sch_htb.c | 25 +++++++++++++++---------- + net/sched/sch_qfq.c | 3 +-- + 5 files changed, 26 insertions(+), 21 deletions(-) + +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -126,6 +126,7 @@ EXPORT_SYMBOL(gnet_stats_basic_packed_in + static void gnet_stats_add_basic_cpu(struct gnet_stats_basic_packed *bstats, + struct gnet_stats_basic_cpu __percpu *cpu) + { ++ u64 t_bytes = 0, t_packets = 0; + int i; + + for_each_possible_cpu(i) { +@@ -139,9 +140,10 @@ static void gnet_stats_add_basic_cpu(str + packets = bcpu->bstats.packets; + } while (u64_stats_fetch_retry_irq(&bcpu->syncp, start)); + +- bstats->bytes += bytes; +- bstats->packets += packets; ++ t_bytes += bytes; ++ t_packets += packets; + } ++ _bstats_update(bstats, t_bytes, t_packets); + } + + void gnet_stats_add_basic(const seqcount_t *running, +@@ -164,8 +166,7 @@ void gnet_stats_add_basic(const seqcount + packets = b->packets; + } while (running && read_seqcount_retry(running, seq)); + +- bstats->bytes += bytes; +- bstats->packets += packets; ++ _bstats_update(bstats, bytes, packets); + } + EXPORT_SYMBOL(gnet_stats_add_basic); + +--- a/net/sched/sch_cbq.c ++++ b/net/sched/sch_cbq.c +@@ -565,8 +565,7 @@ cbq_update(struct cbq_sched_data *q) + long avgidle = cl->avgidle; + long idle; + +- cl->bstats.packets++; +- cl->bstats.bytes += len; ++ _bstats_update(&cl->bstats, len, 1); + + /* + * (now - last) is total time between packet right edges. 
+--- a/net/sched/sch_gred.c ++++ b/net/sched/sch_gred.c +@@ -353,6 +353,7 @@ static int gred_offload_dump_stats(struc + { + struct gred_sched *table = qdisc_priv(sch); + struct tc_gred_qopt_offload *hw_stats; ++ u64 bytes = 0, packets = 0; + unsigned int i; + int ret; + +@@ -381,15 +382,15 @@ static int gred_offload_dump_stats(struc + table->tab[i]->bytesin += hw_stats->stats.bstats[i].bytes; + table->tab[i]->backlog += hw_stats->stats.qstats[i].backlog; + +- _bstats_update(&sch->bstats, +- hw_stats->stats.bstats[i].bytes, +- hw_stats->stats.bstats[i].packets); ++ bytes += hw_stats->stats.bstats[i].bytes; ++ packets += hw_stats->stats.bstats[i].packets; + sch->qstats.qlen += hw_stats->stats.qstats[i].qlen; + sch->qstats.backlog += hw_stats->stats.qstats[i].backlog; + sch->qstats.drops += hw_stats->stats.qstats[i].drops; + sch->qstats.requeues += hw_stats->stats.qstats[i].requeues; + sch->qstats.overlimits += hw_stats->stats.qstats[i].overlimits; + } ++ _bstats_update(&sch->bstats, bytes, packets); + + kfree(hw_stats); + return ret; +--- a/net/sched/sch_htb.c ++++ b/net/sched/sch_htb.c +@@ -1308,6 +1308,7 @@ static int htb_dump_class(struct Qdisc * + static void htb_offload_aggregate_stats(struct htb_sched *q, + struct htb_class *cl) + { ++ u64 bytes = 0, packets = 0; + struct htb_class *c; + unsigned int i; + +@@ -1323,14 +1324,15 @@ static void htb_offload_aggregate_stats( + if (p != cl) + continue; + +- cl->bstats.bytes += c->bstats_bias.bytes; +- cl->bstats.packets += c->bstats_bias.packets; ++ bytes += c->bstats_bias.bytes; ++ packets += c->bstats_bias.packets; + if (c->level == 0) { +- cl->bstats.bytes += c->leaf.q->bstats.bytes; +- cl->bstats.packets += c->leaf.q->bstats.packets; ++ bytes += c->leaf.q->bstats.bytes; ++ packets += c->leaf.q->bstats.packets; + } + } + } ++ _bstats_update(&cl->bstats, bytes, packets); + } + + static int +@@ -1358,8 +1360,9 @@ htb_dump_class_stats(struct Qdisc *sch, + cl->bstats = cl->leaf.q->bstats; + else + 
gnet_stats_basic_packed_init(&cl->bstats); +- cl->bstats.bytes += cl->bstats_bias.bytes; +- cl->bstats.packets += cl->bstats_bias.packets; ++ _bstats_update(&cl->bstats, ++ cl->bstats_bias.bytes, ++ cl->bstats_bias.packets); + } else { + htb_offload_aggregate_stats(q, cl); + } +@@ -1578,8 +1581,9 @@ static int htb_destroy_class_offload(str + WARN_ON(old != q); + + if (cl->parent) { +- cl->parent->bstats_bias.bytes += q->bstats.bytes; +- cl->parent->bstats_bias.packets += q->bstats.packets; ++ _bstats_update(&cl->parent->bstats_bias, ++ q->bstats.bytes, ++ q->bstats.packets); + } + + offload_opt = (struct tc_htb_qopt_offload) { +@@ -1925,8 +1929,9 @@ static int htb_change_class(struct Qdisc + htb_graft_helper(dev_queue, old_q); + goto err_kill_estimator; + } +- parent->bstats_bias.bytes += old_q->bstats.bytes; +- parent->bstats_bias.packets += old_q->bstats.packets; ++ _bstats_update(&parent->bstats_bias, ++ old_q->bstats.bytes, ++ old_q->bstats.packets); + qdisc_put(old_q); + } + new_q = qdisc_create_dflt(dev_queue, &pfifo_qdisc_ops, +--- a/net/sched/sch_qfq.c ++++ b/net/sched/sch_qfq.c +@@ -1235,8 +1235,7 @@ static int qfq_enqueue(struct sk_buff *s + return err; + } + +- cl->bstats.bytes += len; +- cl->bstats.packets += gso_segs; ++ _bstats_update(&cl->bstats, len, gso_segs); + sch->qstats.backlog += len; + ++sch->q.qlen; + diff --git a/debian/patches-rt/0007-sched-hotplug-Consolidate-task-migration-on-CPU-unpl.patch b/debian/patches-rt/0007-sched-hotplug-Consolidate-task-migration-on-CPU-unpl.patch deleted file mode 100644 index 19b235fd7..000000000 --- a/debian/patches-rt/0007-sched-hotplug-Consolidate-task-migration-on-CPU-unpl.patch +++ /dev/null @@ -1,283 +0,0 @@ -From 38d3ecf0781a070be9fe6cf45ab58f513e9f4138 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 23 Oct 2020 12:12:04 +0200 -Subject: [PATCH 007/296] sched/hotplug: Consolidate task migration on CPU - unplug -Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -With the new mechanism which kicks tasks off the outgoing CPU at the end of -schedule() the situation on an outgoing CPU right before the stopper thread -brings it down completely is: - - - All user tasks and all unbound kernel threads have either been migrated - away or are not running and the next wakeup will move them to a online CPU. - - - All per CPU kernel threads, except cpu hotplug thread and the stopper - thread have either been unbound or parked by the responsible CPU hotplug - callback. - -That means that at the last step before the stopper thread is invoked the -cpu hotplug thread is the last legitimate running task on the outgoing -CPU. - -Add a final wait step right before the stopper thread is kicked which -ensures that any still running tasks on the way to park or on the way to -kick themself of the CPU are either sleeping or gone. - -This allows to remove the migrate_tasks() crutch in sched_cpu_dying(). If -sched_cpu_dying() detects that there is still another running task aside of -the stopper thread then it will explode with the appropriate fireworks. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/cpuhotplug.h | 1 + - include/linux/sched/hotplug.h | 2 + - kernel/cpu.c | 9 +- - kernel/sched/core.c | 154 ++++++++-------------------------- - 4 files changed, 46 insertions(+), 120 deletions(-) - -diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h -index bc56287a1ed1..0042ef362511 100644 ---- a/include/linux/cpuhotplug.h -+++ b/include/linux/cpuhotplug.h -@@ -152,6 +152,7 @@ enum cpuhp_state { - CPUHP_AP_ONLINE, - CPUHP_TEARDOWN_CPU, - CPUHP_AP_ONLINE_IDLE, -+ CPUHP_AP_SCHED_WAIT_EMPTY, - CPUHP_AP_SMPBOOT_THREADS, - CPUHP_AP_X86_VDSO_VMA_ONLINE, - CPUHP_AP_IRQ_AFFINITY_ONLINE, -diff --git a/include/linux/sched/hotplug.h b/include/linux/sched/hotplug.h -index 9a62ffdd296f..412cdaba33eb 100644 ---- a/include/linux/sched/hotplug.h -+++ b/include/linux/sched/hotplug.h -@@ -11,8 +11,10 @@ extern int sched_cpu_activate(unsigned int cpu); - extern int sched_cpu_deactivate(unsigned int cpu); - - #ifdef CONFIG_HOTPLUG_CPU -+extern int sched_cpu_wait_empty(unsigned int cpu); - extern int sched_cpu_dying(unsigned int cpu); - #else -+# define sched_cpu_wait_empty NULL - # define sched_cpu_dying NULL - #endif - -diff --git a/kernel/cpu.c b/kernel/cpu.c -index 2b8d7a5db383..4e11e91010e1 100644 ---- a/kernel/cpu.c -+++ b/kernel/cpu.c -@@ -1606,7 +1606,7 @@ static struct cpuhp_step cpuhp_hp_states[] = { - .name = "ap:online", - }, - /* -- * Handled on controll processor until the plugged processor manages -+ * Handled on control processor until the plugged processor manages - * this itself. 
- */ - [CPUHP_TEARDOWN_CPU] = { -@@ -1615,6 +1615,13 @@ static struct cpuhp_step cpuhp_hp_states[] = { - .teardown.single = takedown_cpu, - .cant_stop = true, - }, -+ -+ [CPUHP_AP_SCHED_WAIT_EMPTY] = { -+ .name = "sched:waitempty", -+ .startup.single = NULL, -+ .teardown.single = sched_cpu_wait_empty, -+ }, -+ - /* Handle smpboot threads park/unpark */ - [CPUHP_AP_SMPBOOT_THREADS] = { - .name = "smpboot/threads:online", -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 116f9797a6d0..9ad43e648a78 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -6739,120 +6739,6 @@ void idle_task_exit(void) - /* finish_cpu(), as ran on the BP, will clean up the active_mm state */ - } - --/* -- * Since this CPU is going 'away' for a while, fold any nr_active delta -- * we might have. Assumes we're called after migrate_tasks() so that the -- * nr_active count is stable. We need to take the teardown thread which -- * is calling this into account, so we hand in adjust = 1 to the load -- * calculation. -- * -- * Also see the comment "Global load-average calculations". -- */ --static void calc_load_migrate(struct rq *rq) --{ -- long delta = calc_load_fold_active(rq, 1); -- if (delta) -- atomic_long_add(delta, &calc_load_tasks); --} -- --static struct task_struct *__pick_migrate_task(struct rq *rq) --{ -- const struct sched_class *class; -- struct task_struct *next; -- -- for_each_class(class) { -- next = class->pick_next_task(rq); -- if (next) { -- next->sched_class->put_prev_task(rq, next); -- return next; -- } -- } -- -- /* The idle class should always have a runnable task */ -- BUG(); --} -- --/* -- * Migrate all tasks from the rq, sleeping tasks will be migrated by -- * try_to_wake_up()->select_task_rq(). -- * -- * Called with rq->lock held even though we'er in stop_machine() and -- * there's no concurrency possible, we hold the required locks anyway -- * because of lock validation efforts. 
-- */ --static void migrate_tasks(struct rq *dead_rq, struct rq_flags *rf) --{ -- struct rq *rq = dead_rq; -- struct task_struct *next, *stop = rq->stop; -- struct rq_flags orf = *rf; -- int dest_cpu; -- -- /* -- * Fudge the rq selection such that the below task selection loop -- * doesn't get stuck on the currently eligible stop task. -- * -- * We're currently inside stop_machine() and the rq is either stuck -- * in the stop_machine_cpu_stop() loop, or we're executing this code, -- * either way we should never end up calling schedule() until we're -- * done here. -- */ -- rq->stop = NULL; -- -- /* -- * put_prev_task() and pick_next_task() sched -- * class method both need to have an up-to-date -- * value of rq->clock[_task] -- */ -- update_rq_clock(rq); -- -- for (;;) { -- /* -- * There's this thread running, bail when that's the only -- * remaining thread: -- */ -- if (rq->nr_running == 1) -- break; -- -- next = __pick_migrate_task(rq); -- -- /* -- * Rules for changing task_struct::cpus_mask are holding -- * both pi_lock and rq->lock, such that holding either -- * stabilizes the mask. -- * -- * Drop rq->lock is not quite as disastrous as it usually is -- * because !cpu_active at this point, which means load-balance -- * will not interfere. Also, stop-machine. -- */ -- rq_unlock(rq, rf); -- raw_spin_lock(&next->pi_lock); -- rq_relock(rq, rf); -- -- /* -- * Since we're inside stop-machine, _nothing_ should have -- * changed the task, WARN if weird stuff happened, because in -- * that case the above rq->lock drop is a fail too. -- */ -- if (WARN_ON(task_rq(next) != rq || !task_on_rq_queued(next))) { -- raw_spin_unlock(&next->pi_lock); -- continue; -- } -- -- /* Find suitable destination for @next, with force if needed. 
*/ -- dest_cpu = select_fallback_rq(dead_rq->cpu, next); -- rq = __migrate_task(rq, rf, next, dest_cpu); -- if (rq != dead_rq) { -- rq_unlock(rq, rf); -- rq = dead_rq; -- *rf = orf; -- rq_relock(rq, rf); -- } -- raw_spin_unlock(&next->pi_lock); -- } -- -- rq->stop = stop; --} -- - static int __balance_push_cpu_stop(void *arg) - { - struct task_struct *p = arg; -@@ -7121,10 +7007,6 @@ int sched_cpu_deactivate(unsigned int cpu) - return ret; - } - sched_domains_numa_masks_clear(cpu); -- -- /* Wait for all non per CPU kernel threads to vanish. */ -- balance_hotplug_wait(); -- - return 0; - } - -@@ -7144,6 +7026,41 @@ int sched_cpu_starting(unsigned int cpu) - } - - #ifdef CONFIG_HOTPLUG_CPU -+ -+/* -+ * Invoked immediately before the stopper thread is invoked to bring the -+ * CPU down completely. At this point all per CPU kthreads except the -+ * hotplug thread (current) and the stopper thread (inactive) have been -+ * either parked or have been unbound from the outgoing CPU. Ensure that -+ * any of those which might be on the way out are gone. -+ * -+ * If after this point a bound task is being woken on this CPU then the -+ * responsible hotplug callback has failed to do it's job. -+ * sched_cpu_dying() will catch it with the appropriate fireworks. -+ */ -+int sched_cpu_wait_empty(unsigned int cpu) -+{ -+ balance_hotplug_wait(); -+ return 0; -+} -+ -+/* -+ * Since this CPU is going 'away' for a while, fold any nr_active delta we -+ * might have. Called from the CPU stopper task after ensuring that the -+ * stopper is the last running task on the CPU, so nr_active count is -+ * stable. We need to take the teardown thread which is calling this into -+ * account, so we hand in adjust = 1 to the load calculation. -+ * -+ * Also see the comment "Global load-average calculations". 
-+ */ -+static void calc_load_migrate(struct rq *rq) -+{ -+ long delta = calc_load_fold_active(rq, 1); -+ -+ if (delta) -+ atomic_long_add(delta, &calc_load_tasks); -+} -+ - int sched_cpu_dying(unsigned int cpu) - { - struct rq *rq = cpu_rq(cpu); -@@ -7157,7 +7074,6 @@ int sched_cpu_dying(unsigned int cpu) - BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span)); - set_rq_offline(rq); - } -- migrate_tasks(rq, &rf); - BUG_ON(rq->nr_running != 1); - rq_unlock_irqrestore(rq, &rf); - --- -2.30.2 - diff --git a/debian/patches-rt/0007_sched_make_cond_resched_lock_variants_rt_aware.patch b/debian/patches-rt/0007_sched_make_cond_resched_lock_variants_rt_aware.patch new file mode 100644 index 000000000..e4fab4b9a --- /dev/null +++ b/debian/patches-rt/0007_sched_make_cond_resched_lock_variants_rt_aware.patch @@ -0,0 +1,92 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: sched: Make cond_resched_lock() variants RT aware +Date: Thu, 23 Sep 2021 18:54:44 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The __might_resched() checks in the cond_resched_lock() variants use +PREEMPT_LOCK_OFFSET for preempt count offset checking which takes the +preemption disable by the spin_lock() which is still held at that point +into account. + +On PREEMPT_RT enabled kernels spin/rw_lock held sections stay preemptible +which means PREEMPT_LOCK_OFFSET is 0, but that still triggers the +__might_resched() check because that takes RCU read side nesting into +account. + +On RT enabled kernels spin/read/write_lock() issue rcu_read_lock() to +resemble the !RT semantics, which means in cond_resched_lock() the might +resched check will see preempt_count() == 0 and rcu_preempt_depth() == 1. + +Introduce PREEMPT_LOCK_SCHED_OFFSET for those might resched checks and map +them depending on CONFIG_PREEMPT_RT. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210923165358.305969211@linutronix.de +--- + include/linux/preempt.h | 5 +++-- + include/linux/sched.h | 34 +++++++++++++++++++++++++--------- + 2 files changed, 28 insertions(+), 11 deletions(-) + +--- a/include/linux/preempt.h ++++ b/include/linux/preempt.h +@@ -122,9 +122,10 @@ + * The preempt_count offset after spin_lock() + */ + #if !defined(CONFIG_PREEMPT_RT) +-#define PREEMPT_LOCK_OFFSET PREEMPT_DISABLE_OFFSET ++#define PREEMPT_LOCK_OFFSET PREEMPT_DISABLE_OFFSET + #else +-#define PREEMPT_LOCK_OFFSET 0 ++/* Locks on RT do not disable preemption */ ++#define PREEMPT_LOCK_OFFSET 0 + #endif + + /* +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -2060,19 +2060,35 @@ extern int __cond_resched_rwlock_write(r + #define MIGHT_RESCHED_RCU_SHIFT 8 + #define MIGHT_RESCHED_PREEMPT_MASK ((1U << MIGHT_RESCHED_RCU_SHIFT) - 1) + +-#define cond_resched_lock(lock) ({ \ +- __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ +- __cond_resched_lock(lock); \ ++#ifndef CONFIG_PREEMPT_RT ++/* ++ * Non RT kernels have an elevated preempt count due to the held lock, ++ * but are not allowed to be inside a RCU read side critical section ++ */ ++# define PREEMPT_LOCK_RESCHED_OFFSETS PREEMPT_LOCK_OFFSET ++#else ++/* ++ * spin/rw_lock() on RT implies rcu_read_lock(). The might_sleep() check in ++ * cond_resched*lock() has to take that into account because it checks for ++ * preempt_count() and rcu_preempt_depth(). 
++ */ ++# define PREEMPT_LOCK_RESCHED_OFFSETS \ ++ (PREEMPT_LOCK_OFFSET + (1U << MIGHT_RESCHED_RCU_SHIFT)) ++#endif ++ ++#define cond_resched_lock(lock) ({ \ ++ __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_RESCHED_OFFSETS); \ ++ __cond_resched_lock(lock); \ + }) + +-#define cond_resched_rwlock_read(lock) ({ \ +- __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ +- __cond_resched_rwlock_read(lock); \ ++#define cond_resched_rwlock_read(lock) ({ \ ++ __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_RESCHED_OFFSETS); \ ++ __cond_resched_rwlock_read(lock); \ + }) + +-#define cond_resched_rwlock_write(lock) ({ \ +- __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \ +- __cond_resched_rwlock_write(lock); \ ++#define cond_resched_rwlock_write(lock) ({ \ ++ __might_resched(__FILE__, __LINE__, PREEMPT_LOCK_RESCHED_OFFSETS); \ ++ __cond_resched_rwlock_write(lock); \ + }) + + static inline void cond_resched_rcu(void) diff --git a/debian/patches-rt/0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch b/debian/patches-rt/0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch new file mode 100644 index 000000000..7f85c293b --- /dev/null +++ b/debian/patches-rt/0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch @@ -0,0 +1,42 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Wed, 8 Sep 2021 17:18:00 +0200 +Subject: [PATCH 08/10] drm/i915/gt: Queue and wait for the irq_work item. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Disabling interrupts and invoking the irq_work function directly breaks +on PREEMPT_RT. +PREEMPT_RT does not invoke all irq_work from hardirq context because +some of the user have spinlock_t locking in the callback function. +These locks are then turned into a sleeping locks which can not be +acquired with disabled interrupts. + +Using irq_work_queue() has the benefit that the irqwork will be invoked +in the regular context. 
In general there is "no" delay between enqueuing +the callback and its invocation because the interrupt is raised right +away on architectures which support it (which includes x86). + +Use irq_work_queue() + irq_work_sync() instead invoking the callback +directly. + +Reported-by: Clark Williams <williams@redhat.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> +--- + drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 5 ++--- + 1 file changed, 2 insertions(+), 3 deletions(-) + +--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c ++++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +@@ -311,10 +311,9 @@ void __intel_breadcrumbs_park(struct int + /* Kick the work once more to drain the signalers, and disarm the irq */ + irq_work_sync(&b->irq_work); + while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) { +- local_irq_disable(); +- signal_irq_work(&b->irq_work); +- local_irq_enable(); ++ irq_work_queue(&b->irq_work); + cond_resched(); ++ irq_work_sync(&b->irq_work); + } + } + diff --git a/debian/patches-rt/0008-lockdep-selftests-Skip-the-softirq-related-tests-on-.patch b/debian/patches-rt/0008-lockdep-selftests-Skip-the-softirq-related-tests-on-.patch new file mode 100644 index 000000000..21137ae06 --- /dev/null +++ b/debian/patches-rt/0008-lockdep-selftests-Skip-the-softirq-related-tests-on-.patch @@ -0,0 +1,218 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 12 Aug 2021 16:02:29 +0200 +Subject: [PATCH 08/10] lockdep/selftests: Skip the softirq related tests on + PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The softirq context on PREEMPT_RT is different compared to !PREEMPT_RT. +As such lockdep_softirq_enter() is a nop and the all the "softirq safe" +tests fail on PREEMPT_RT because there is no difference. + +Skip the softirq context tests on PREEMPT_RT. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + lib/locking-selftest.c | 38 +++++++++++++++++++++++++++++++------- + 1 file changed, 31 insertions(+), 7 deletions(-) + +--- a/lib/locking-selftest.c ++++ b/lib/locking-selftest.c +@@ -26,6 +26,12 @@ + #include <linux/rtmutex.h> + #include <linux/local_lock.h> + ++#ifdef CONFIG_PREEMPT_RT ++# define NON_RT(...) ++#else ++# define NON_RT(...) __VA_ARGS__ ++#endif ++ + /* + * Change this to 1 if you want to see the failure printouts: + */ +@@ -808,6 +814,7 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_ + #include "locking-selftest-wlock-hardirq.h" + GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-spin-softirq.h" + GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_soft_spin) + +@@ -816,10 +823,12 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_ + + #include "locking-selftest-wlock-softirq.h" + GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_soft_wlock) ++#endif + + #undef E1 + #undef E2 + ++#ifndef CONFIG_PREEMPT_RT + /* + * Enabling hardirqs with a softirq-safe lock held: + */ +@@ -852,6 +861,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2A + #undef E1 + #undef E2 + ++#endif ++ + /* + * Enabling irqs with an irq-safe lock held: + */ +@@ -881,6 +892,7 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B + #include "locking-selftest-wlock-hardirq.h" + GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-spin-softirq.h" + GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_spin) + +@@ -889,6 +901,7 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B + + #include "locking-selftest-wlock-softirq.h" + GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_wlock) ++#endif + + #undef E1 + #undef E2 +@@ -927,6 +940,7 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_ + #include "locking-selftest-wlock-hardirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-spin-softirq.h" + 
GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_spin) + +@@ -935,6 +949,7 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_ + + #include "locking-selftest-wlock-softirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_wlock) ++#endif + + #undef E1 + #undef E2 +@@ -975,6 +990,7 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_ + #include "locking-selftest-wlock-hardirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-spin-softirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_soft_spin) + +@@ -983,6 +999,7 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_ + + #include "locking-selftest-wlock-softirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_soft_wlock) ++#endif + + #undef E1 + #undef E2 +@@ -1037,6 +1054,7 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_inver + #include "locking-selftest-wlock-hardirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-spin-softirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_soft_spin) + +@@ -1045,6 +1063,7 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_inver + + #include "locking-selftest-wlock-softirq.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_inversion_soft_wlock) ++#endif + + #undef E1 + #undef E2 +@@ -1212,12 +1231,14 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_ + #include "locking-selftest-wlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-softirq.h" + #include "locking-selftest-rlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft_rlock) + + #include "locking-selftest-wlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft_wlock) ++#endif + + #undef E1 + #undef E2 +@@ -1258,12 +1279,14 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_ + #include "locking-selftest-wlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-softirq.h" + #include 
"locking-selftest-rlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft_rlock) + + #include "locking-selftest-wlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion2_soft_wlock) ++#endif + + #undef E1 + #undef E2 +@@ -1312,12 +1335,14 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_ + #include "locking-selftest-wlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion3_hard_wlock) + ++#ifndef CONFIG_PREEMPT_RT + #include "locking-selftest-softirq.h" + #include "locking-selftest-rlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion3_soft_rlock) + + #include "locking-selftest-wlock.h" + GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion3_soft_wlock) ++#endif + + #ifdef CONFIG_DEBUG_LOCK_ALLOC + # define I_SPINLOCK(x) lockdep_reset_lock(&lock_##x.dep_map) +@@ -1523,7 +1548,7 @@ static inline void print_testname(const + + #define DO_TESTCASE_2x2RW(desc, name, nr) \ + DO_TESTCASE_2RW("hard-"desc, name##_hard, nr) \ +- DO_TESTCASE_2RW("soft-"desc, name##_soft, nr) \ ++ NON_RT(DO_TESTCASE_2RW("soft-"desc, name##_soft, nr)) \ + + #define DO_TESTCASE_6x2x2RW(desc, name) \ + DO_TESTCASE_2x2RW(desc, name, 123); \ +@@ -1571,19 +1596,19 @@ static inline void print_testname(const + + #define DO_TESTCASE_2I(desc, name, nr) \ + DO_TESTCASE_1("hard-"desc, name##_hard, nr); \ +- DO_TESTCASE_1("soft-"desc, name##_soft, nr); ++ NON_RT(DO_TESTCASE_1("soft-"desc, name##_soft, nr)); + + #define DO_TESTCASE_2IB(desc, name, nr) \ + DO_TESTCASE_1B("hard-"desc, name##_hard, nr); \ +- DO_TESTCASE_1B("soft-"desc, name##_soft, nr); ++ NON_RT(DO_TESTCASE_1B("soft-"desc, name##_soft, nr)); + + #define DO_TESTCASE_6I(desc, name, nr) \ + DO_TESTCASE_3("hard-"desc, name##_hard, nr); \ +- DO_TESTCASE_3("soft-"desc, name##_soft, nr); ++ NON_RT(DO_TESTCASE_3("soft-"desc, name##_soft, nr)); + + #define DO_TESTCASE_6IRW(desc, name, nr) \ + DO_TESTCASE_3RW("hard-"desc, name##_hard, nr); \ +- DO_TESTCASE_3RW("soft-"desc, name##_soft, nr); ++ NON_RT(DO_TESTCASE_3RW("soft-"desc, 
name##_soft, nr)); + + #define DO_TESTCASE_2x3(desc, name) \ + DO_TESTCASE_3(desc, name, 12); \ +@@ -2909,12 +2934,11 @@ void locking_selftest(void) + DO_TESTCASE_6x1RR("rlock W1R2/R2R3/W3W1", W1R2_R2R3_W3W1); + + printk(" --------------------------------------------------------------------------\n"); +- + /* + * irq-context testcases: + */ + DO_TESTCASE_2x6("irqs-on + irq-safe-A", irqsafe1); +- DO_TESTCASE_2x3("sirq-safe-A => hirqs-on", irqsafe2A); ++ NON_RT(DO_TESTCASE_2x3("sirq-safe-A => hirqs-on", irqsafe2A)); + DO_TESTCASE_2x6("safe-A + irqs-on", irqsafe2B); + DO_TESTCASE_6x6("safe-A + unsafe-B #1", irqsafe3); + DO_TESTCASE_6x6("safe-A + unsafe-B #2", irqsafe4); diff --git a/debian/patches-rt/0008-net-sched-Merge-Qdisc-bstats-and-Qdisc-cpu_bstats-da.patch b/debian/patches-rt/0008-net-sched-Merge-Qdisc-bstats-and-Qdisc-cpu_bstats-da.patch new file mode 100644 index 000000000..cd2c0b496 --- /dev/null +++ b/debian/patches-rt/0008-net-sched-Merge-Qdisc-bstats-and-Qdisc-cpu_bstats-da.patch @@ -0,0 +1,995 @@ +From: "Ahmed S. Darwish" <a.darwish@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:09 +0200 +Subject: [PATCH 8/9] net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats + data types +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The only factor differentiating per-CPU bstats data type (struct +gnet_stats_basic_cpu) from the packed non-per-CPU one (struct +gnet_stats_basic_packed) was a u64_stats sync point inside the former. +The two data types are now equivalent: earlier commits added a u64_stats +sync point to the latter. + +Combine both data types into "struct gnet_stats_basic_sync". This +eliminates redundancy and simplifies the bstats read/write APIs. + +Use u64_stats_t for bstats "packets" and "bytes" data types. On 64-bit +architectures, u64_stats sync points do not use sequence counter +protection. + +Signed-off-by: Ahmed S. 
Darwish <a.darwish@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. Miller <davem@davemloft.net> +--- + drivers/net/ethernet/netronome/nfp/abm/qdisc.c | 2 + include/net/act_api.h | 10 ++-- + include/net/gen_stats.h | 44 +++++++++-------- + include/net/netfilter/xt_rateest.h | 2 + include/net/pkt_cls.h | 4 - + include/net/sch_generic.h | 34 +++---------- + net/core/gen_estimator.c | 36 ++++++++------ + net/core/gen_stats.c | 62 +++++++++++++------------ + net/netfilter/xt_RATEEST.c | 8 +-- + net/sched/act_api.c | 14 ++--- + net/sched/act_bpf.c | 2 + net/sched/act_ife.c | 4 - + net/sched/act_mpls.c | 2 + net/sched/act_police.c | 2 + net/sched/act_sample.c | 2 + net/sched/act_simple.c | 3 - + net/sched/act_skbedit.c | 2 + net/sched/act_skbmod.c | 2 + net/sched/sch_api.c | 2 + net/sched/sch_atm.c | 4 - + net/sched/sch_cbq.c | 4 - + net/sched/sch_drr.c | 4 - + net/sched/sch_ets.c | 4 - + net/sched/sch_generic.c | 4 - + net/sched/sch_gred.c | 10 ++-- + net/sched/sch_hfsc.c | 4 - + net/sched/sch_htb.c | 32 ++++++------ + net/sched/sch_mq.c | 2 + net/sched/sch_mqprio.c | 6 +- + net/sched/sch_qfq.c | 4 - + 30 files changed, 155 insertions(+), 160 deletions(-) + +--- a/drivers/net/ethernet/netronome/nfp/abm/qdisc.c ++++ b/drivers/net/ethernet/netronome/nfp/abm/qdisc.c +@@ -458,7 +458,7 @@ nfp_abm_qdisc_graft(struct nfp_abm_link + static void + nfp_abm_stats_calculate(struct nfp_alink_stats *new, + struct nfp_alink_stats *old, +- struct gnet_stats_basic_packed *bstats, ++ struct gnet_stats_basic_sync *bstats, + struct gnet_stats_queue *qstats) + { + _bstats_update(bstats, new->tx_bytes - old->tx_bytes, +--- a/include/net/act_api.h ++++ b/include/net/act_api.h +@@ -30,13 +30,13 @@ struct tc_action { + atomic_t tcfa_bindcnt; + int tcfa_action; + struct tcf_t tcfa_tm; +- struct gnet_stats_basic_packed tcfa_bstats; +- struct gnet_stats_basic_packed tcfa_bstats_hw; ++ struct gnet_stats_basic_sync tcfa_bstats; ++ struct 
gnet_stats_basic_sync tcfa_bstats_hw; + struct gnet_stats_queue tcfa_qstats; + struct net_rate_estimator __rcu *tcfa_rate_est; + spinlock_t tcfa_lock; +- struct gnet_stats_basic_cpu __percpu *cpu_bstats; +- struct gnet_stats_basic_cpu __percpu *cpu_bstats_hw; ++ struct gnet_stats_basic_sync __percpu *cpu_bstats; ++ struct gnet_stats_basic_sync __percpu *cpu_bstats_hw; + struct gnet_stats_queue __percpu *cpu_qstats; + struct tc_cookie __rcu *act_cookie; + struct tcf_chain __rcu *goto_chain; +@@ -206,7 +206,7 @@ static inline void tcf_action_update_bst + struct sk_buff *skb) + { + if (likely(a->cpu_bstats)) { +- bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(a->cpu_bstats), skb); + return; + } + spin_lock(&a->tcfa_lock); +--- a/include/net/gen_stats.h ++++ b/include/net/gen_stats.h +@@ -7,15 +7,17 @@ + #include <linux/rtnetlink.h> + #include <linux/pkt_sched.h> + +-/* Note: this used to be in include/uapi/linux/gen_stats.h */ +-struct gnet_stats_basic_packed { +- __u64 bytes; +- __u64 packets; +- struct u64_stats_sync syncp; +-}; +- +-struct gnet_stats_basic_cpu { +- struct gnet_stats_basic_packed bstats; ++/* Throughput stats. ++ * Must be initialized beforehand with gnet_stats_basic_sync_init(). ++ * ++ * If no reads can ever occur parallel to writes (e.g. stack-allocated ++ * bstats), then the internal stat values can be written to and read ++ * from directly. Otherwise, use _bstats_set/update() for writes and ++ * gnet_stats_add_basic() for reads. 
++ */ ++struct gnet_stats_basic_sync { ++ u64_stats_t bytes; ++ u64_stats_t packets; + struct u64_stats_sync syncp; + } __aligned(2 * sizeof(u64)); + +@@ -35,7 +37,7 @@ struct gnet_dump { + struct tc_stats tc_stats; + }; + +-void gnet_stats_basic_packed_init(struct gnet_stats_basic_packed *b); ++void gnet_stats_basic_sync_init(struct gnet_stats_basic_sync *b); + int gnet_stats_start_copy(struct sk_buff *skb, int type, spinlock_t *lock, + struct gnet_dump *d, int padattr); + +@@ -46,16 +48,16 @@ int gnet_stats_start_copy_compat(struct + + int gnet_stats_copy_basic(const seqcount_t *running, + struct gnet_dump *d, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b); ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b); + void gnet_stats_add_basic(const seqcount_t *running, +- struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b); ++ struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b); + int gnet_stats_copy_basic_hw(const seqcount_t *running, + struct gnet_dump *d, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b); ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b); + int gnet_stats_copy_rate_est(struct gnet_dump *d, + struct net_rate_estimator __rcu **ptr); + int gnet_stats_copy_queue(struct gnet_dump *d, +@@ -68,14 +70,14 @@ int gnet_stats_copy_app(struct gnet_dump + + int gnet_stats_finish_copy(struct gnet_dump *d); + +-int gen_new_estimator(struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu_bstats, ++int gen_new_estimator(struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **rate_est, + spinlock_t *lock, + seqcount_t *running, struct nlattr *opt); + void gen_kill_estimator(struct 
net_rate_estimator __rcu **ptr); +-int gen_replace_estimator(struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu_bstats, ++int gen_replace_estimator(struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **ptr, + spinlock_t *lock, + seqcount_t *running, struct nlattr *opt); +--- a/include/net/netfilter/xt_rateest.h ++++ b/include/net/netfilter/xt_rateest.h +@@ -6,7 +6,7 @@ + + struct xt_rateest { + /* keep lock and bstats on same cache line to speedup xt_rateest_tg() */ +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + spinlock_t lock; + + +--- a/include/net/pkt_cls.h ++++ b/include/net/pkt_cls.h +@@ -765,7 +765,7 @@ struct tc_cookie { + }; + + struct tc_qopt_offload_stats { +- struct gnet_stats_basic_packed *bstats; ++ struct gnet_stats_basic_sync *bstats; + struct gnet_stats_queue *qstats; + }; + +@@ -885,7 +885,7 @@ struct tc_gred_qopt_offload_params { + }; + + struct tc_gred_qopt_offload_stats { +- struct gnet_stats_basic_packed bstats[MAX_DPs]; ++ struct gnet_stats_basic_sync bstats[MAX_DPs]; + struct gnet_stats_queue qstats[MAX_DPs]; + struct red_stats *xstats[MAX_DPs]; + }; +--- a/include/net/sch_generic.h ++++ b/include/net/sch_generic.h +@@ -97,7 +97,7 @@ struct Qdisc { + struct netdev_queue *dev_queue; + + struct net_rate_estimator __rcu *rate_est; +- struct gnet_stats_basic_cpu __percpu *cpu_bstats; ++ struct gnet_stats_basic_sync __percpu *cpu_bstats; + struct gnet_stats_queue __percpu *cpu_qstats; + int pad; + refcount_t refcnt; +@@ -107,7 +107,7 @@ struct Qdisc { + */ + struct sk_buff_head gso_skb ____cacheline_aligned_in_smp; + struct qdisc_skb_head q; +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + seqcount_t running; + struct gnet_stats_queue qstats; + unsigned long state; +@@ -845,16 +845,16 @@ static inline int qdisc_enqueue(struct s + return sch->enqueue(skb, sch, to_free); 
+ } + +-static inline void _bstats_update(struct gnet_stats_basic_packed *bstats, ++static inline void _bstats_update(struct gnet_stats_basic_sync *bstats, + __u64 bytes, __u32 packets) + { + u64_stats_update_begin(&bstats->syncp); +- bstats->bytes += bytes; +- bstats->packets += packets; ++ u64_stats_add(&bstats->bytes, bytes); ++ u64_stats_add(&bstats->packets, packets); + u64_stats_update_end(&bstats->syncp); + } + +-static inline void bstats_update(struct gnet_stats_basic_packed *bstats, ++static inline void bstats_update(struct gnet_stats_basic_sync *bstats, + const struct sk_buff *skb) + { + _bstats_update(bstats, +@@ -862,26 +862,10 @@ static inline void bstats_update(struct + skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1); + } + +-static inline void _bstats_cpu_update(struct gnet_stats_basic_cpu *bstats, +- __u64 bytes, __u32 packets) +-{ +- u64_stats_update_begin(&bstats->syncp); +- _bstats_update(&bstats->bstats, bytes, packets); +- u64_stats_update_end(&bstats->syncp); +-} +- +-static inline void bstats_cpu_update(struct gnet_stats_basic_cpu *bstats, +- const struct sk_buff *skb) +-{ +- u64_stats_update_begin(&bstats->syncp); +- bstats_update(&bstats->bstats, skb); +- u64_stats_update_end(&bstats->syncp); +-} +- + static inline void qdisc_bstats_cpu_update(struct Qdisc *sch, + const struct sk_buff *skb) + { +- bstats_cpu_update(this_cpu_ptr(sch->cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(sch->cpu_bstats), skb); + } + + static inline void qdisc_bstats_update(struct Qdisc *sch, +@@ -1313,7 +1297,7 @@ void psched_ppscfg_precompute(struct psc + struct mini_Qdisc { + struct tcf_proto *filter_list; + struct tcf_block *block; +- struct gnet_stats_basic_cpu __percpu *cpu_bstats; ++ struct gnet_stats_basic_sync __percpu *cpu_bstats; + struct gnet_stats_queue __percpu *cpu_qstats; + struct rcu_head rcu; + }; +@@ -1321,7 +1305,7 @@ struct mini_Qdisc { + static inline void mini_qdisc_bstats_cpu_update(struct mini_Qdisc *miniq, + const struct sk_buff *skb) + 
{ +- bstats_cpu_update(this_cpu_ptr(miniq->cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(miniq->cpu_bstats), skb); + } + + static inline void mini_qdisc_qstats_cpu_drop(struct mini_Qdisc *miniq) +--- a/net/core/gen_estimator.c ++++ b/net/core/gen_estimator.c +@@ -40,10 +40,10 @@ + */ + + struct net_rate_estimator { +- struct gnet_stats_basic_packed *bstats; ++ struct gnet_stats_basic_sync *bstats; + spinlock_t *stats_lock; + seqcount_t *running; +- struct gnet_stats_basic_cpu __percpu *cpu_bstats; ++ struct gnet_stats_basic_sync __percpu *cpu_bstats; + u8 ewma_log; + u8 intvl_log; /* period : (250ms << intvl_log) */ + +@@ -60,9 +60,9 @@ struct net_rate_estimator { + }; + + static void est_fetch_counters(struct net_rate_estimator *e, +- struct gnet_stats_basic_packed *b) ++ struct gnet_stats_basic_sync *b) + { +- gnet_stats_basic_packed_init(b); ++ gnet_stats_basic_sync_init(b); + if (e->stats_lock) + spin_lock(e->stats_lock); + +@@ -76,14 +76,18 @@ static void est_fetch_counters(struct ne + static void est_timer(struct timer_list *t) + { + struct net_rate_estimator *est = from_timer(est, t, timer); +- struct gnet_stats_basic_packed b; ++ struct gnet_stats_basic_sync b; ++ u64 b_bytes, b_packets; + u64 rate, brate; + + est_fetch_counters(est, &b); +- brate = (b.bytes - est->last_bytes) << (10 - est->intvl_log); ++ b_bytes = u64_stats_read(&b.bytes); ++ b_packets = u64_stats_read(&b.packets); ++ ++ brate = (b_bytes - est->last_bytes) << (10 - est->intvl_log); + brate = (brate >> est->ewma_log) - (est->avbps >> est->ewma_log); + +- rate = (b.packets - est->last_packets) << (10 - est->intvl_log); ++ rate = (b_packets - est->last_packets) << (10 - est->intvl_log); + rate = (rate >> est->ewma_log) - (est->avpps >> est->ewma_log); + + write_seqcount_begin(&est->seq); +@@ -91,8 +95,8 @@ static void est_timer(struct timer_list + est->avpps += rate; + write_seqcount_end(&est->seq); + +- est->last_bytes = b.bytes; +- est->last_packets = b.packets; ++ est->last_bytes = 
b_bytes; ++ est->last_packets = b_packets; + + est->next_jiffies += ((HZ/4) << est->intvl_log); + +@@ -121,8 +125,8 @@ static void est_timer(struct timer_list + * Returns 0 on success or a negative error code. + * + */ +-int gen_new_estimator(struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu_bstats, ++int gen_new_estimator(struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **rate_est, + spinlock_t *lock, + seqcount_t *running, +@@ -130,7 +134,7 @@ int gen_new_estimator(struct gnet_stats_ + { + struct gnet_estimator *parm = nla_data(opt); + struct net_rate_estimator *old, *est; +- struct gnet_stats_basic_packed b; ++ struct gnet_stats_basic_sync b; + int intvl_log; + + if (nla_len(opt) < sizeof(*parm)) +@@ -164,8 +168,8 @@ int gen_new_estimator(struct gnet_stats_ + est_fetch_counters(est, &b); + if (lock) + local_bh_enable(); +- est->last_bytes = b.bytes; +- est->last_packets = b.packets; ++ est->last_bytes = u64_stats_read(&b.bytes); ++ est->last_packets = u64_stats_read(&b.packets); + + if (lock) + spin_lock_bh(lock); +@@ -222,8 +226,8 @@ EXPORT_SYMBOL(gen_kill_estimator); + * + * Returns 0 on success or a negative error code. 
+ */ +-int gen_replace_estimator(struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu_bstats, ++int gen_replace_estimator(struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **rate_est, + spinlock_t *lock, + seqcount_t *running, struct nlattr *opt) +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -115,29 +115,29 @@ gnet_stats_start_copy(struct sk_buff *sk + EXPORT_SYMBOL(gnet_stats_start_copy); + + /* Must not be inlined, due to u64_stats seqcount_t lockdep key */ +-void gnet_stats_basic_packed_init(struct gnet_stats_basic_packed *b) ++void gnet_stats_basic_sync_init(struct gnet_stats_basic_sync *b) + { +- b->bytes = 0; +- b->packets = 0; ++ u64_stats_set(&b->bytes, 0); ++ u64_stats_set(&b->packets, 0); + u64_stats_init(&b->syncp); + } +-EXPORT_SYMBOL(gnet_stats_basic_packed_init); ++EXPORT_SYMBOL(gnet_stats_basic_sync_init); + +-static void gnet_stats_add_basic_cpu(struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu) ++static void gnet_stats_add_basic_cpu(struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync __percpu *cpu) + { + u64 t_bytes = 0, t_packets = 0; + int i; + + for_each_possible_cpu(i) { +- struct gnet_stats_basic_cpu *bcpu = per_cpu_ptr(cpu, i); ++ struct gnet_stats_basic_sync *bcpu = per_cpu_ptr(cpu, i); + unsigned int start; + u64 bytes, packets; + + do { + start = u64_stats_fetch_begin_irq(&bcpu->syncp); +- bytes = bcpu->bstats.bytes; +- packets = bcpu->bstats.packets; ++ bytes = u64_stats_read(&bcpu->bytes); ++ packets = u64_stats_read(&bcpu->packets); + } while (u64_stats_fetch_retry_irq(&bcpu->syncp, start)); + + t_bytes += bytes; +@@ -147,9 +147,9 @@ static void gnet_stats_add_basic_cpu(str + } + + void gnet_stats_add_basic(const seqcount_t *running, +- struct gnet_stats_basic_packed *bstats, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b) ++ 
struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b) + { + unsigned int seq; + u64 bytes = 0; +@@ -162,8 +162,8 @@ void gnet_stats_add_basic(const seqcount + do { + if (running) + seq = read_seqcount_begin(running); +- bytes = b->bytes; +- packets = b->packets; ++ bytes = u64_stats_read(&b->bytes); ++ packets = u64_stats_read(&b->packets); + } while (running && read_seqcount_retry(running, seq)); + + _bstats_update(bstats, bytes, packets); +@@ -173,18 +173,22 @@ EXPORT_SYMBOL(gnet_stats_add_basic); + static int + ___gnet_stats_copy_basic(const seqcount_t *running, + struct gnet_dump *d, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b, ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b, + int type) + { +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; ++ u64 bstats_bytes, bstats_packets; + +- gnet_stats_basic_packed_init(&bstats); ++ gnet_stats_basic_sync_init(&bstats); + gnet_stats_add_basic(running, &bstats, cpu, b); + ++ bstats_bytes = u64_stats_read(&bstats.bytes); ++ bstats_packets = u64_stats_read(&bstats.packets); ++ + if (d->compat_tc_stats && type == TCA_STATS_BASIC) { +- d->tc_stats.bytes = bstats.bytes; +- d->tc_stats.packets = bstats.packets; ++ d->tc_stats.bytes = bstats_bytes; ++ d->tc_stats.packets = bstats_packets; + } + + if (d->tail) { +@@ -192,14 +196,14 @@ static int + int res; + + memset(&sb, 0, sizeof(sb)); +- sb.bytes = bstats.bytes; +- sb.packets = bstats.packets; ++ sb.bytes = bstats_bytes; ++ sb.packets = bstats_packets; + res = gnet_stats_copy(d, type, &sb, sizeof(sb), TCA_STATS_PAD); +- if (res < 0 || sb.packets == bstats.packets) ++ if (res < 0 || sb.packets == bstats_packets) + return res; + /* emit 64bit stats only if needed */ +- return gnet_stats_copy(d, TCA_STATS_PKT64, &bstats.packets, +- sizeof(bstats.packets), TCA_STATS_PAD); ++ return gnet_stats_copy(d, 
TCA_STATS_PKT64, &bstats_packets, ++ sizeof(bstats_packets), TCA_STATS_PAD); + } + return 0; + } +@@ -220,8 +224,8 @@ static int + int + gnet_stats_copy_basic(const seqcount_t *running, + struct gnet_dump *d, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b) ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b) + { + return ___gnet_stats_copy_basic(running, d, cpu, b, + TCA_STATS_BASIC); +@@ -244,8 +248,8 @@ EXPORT_SYMBOL(gnet_stats_copy_basic); + int + gnet_stats_copy_basic_hw(const seqcount_t *running, + struct gnet_dump *d, +- struct gnet_stats_basic_cpu __percpu *cpu, +- struct gnet_stats_basic_packed *b) ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b) + { + return ___gnet_stats_copy_basic(running, d, cpu, b, + TCA_STATS_BASIC_HW); +--- a/net/netfilter/xt_RATEEST.c ++++ b/net/netfilter/xt_RATEEST.c +@@ -94,11 +94,11 @@ static unsigned int + xt_rateest_tg(struct sk_buff *skb, const struct xt_action_param *par) + { + const struct xt_rateest_target_info *info = par->targinfo; +- struct gnet_stats_basic_packed *stats = &info->est->bstats; ++ struct gnet_stats_basic_sync *stats = &info->est->bstats; + + spin_lock_bh(&info->est->lock); +- stats->bytes += skb->len; +- stats->packets++; ++ u64_stats_add(&stats->bytes, skb->len); ++ u64_stats_inc(&stats->packets); + spin_unlock_bh(&info->est->lock); + + return XT_CONTINUE; +@@ -143,7 +143,7 @@ static int xt_rateest_tg_checkentry(cons + if (!est) + goto err1; + +- gnet_stats_basic_packed_init(&est->bstats); ++ gnet_stats_basic_sync_init(&est->bstats); + strlcpy(est->name, info->name, sizeof(est->name)); + spin_lock_init(&est->lock); + est->refcnt = 1; +--- a/net/sched/act_api.c ++++ b/net/sched/act_api.c +@@ -480,18 +480,18 @@ int tcf_idr_create(struct tc_action_net + atomic_set(&p->tcfa_bindcnt, 1); + + if (cpustats) { +- p->cpu_bstats = netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu); ++ p->cpu_bstats = 
netdev_alloc_pcpu_stats(struct gnet_stats_basic_sync); + if (!p->cpu_bstats) + goto err1; +- p->cpu_bstats_hw = netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu); ++ p->cpu_bstats_hw = netdev_alloc_pcpu_stats(struct gnet_stats_basic_sync); + if (!p->cpu_bstats_hw) + goto err2; + p->cpu_qstats = alloc_percpu(struct gnet_stats_queue); + if (!p->cpu_qstats) + goto err3; + } +- gnet_stats_basic_packed_init(&p->tcfa_bstats); +- gnet_stats_basic_packed_init(&p->tcfa_bstats_hw); ++ gnet_stats_basic_sync_init(&p->tcfa_bstats); ++ gnet_stats_basic_sync_init(&p->tcfa_bstats_hw); + spin_lock_init(&p->tcfa_lock); + p->tcfa_index = index; + p->tcfa_tm.install = jiffies; +@@ -1128,13 +1128,13 @@ void tcf_action_update_stats(struct tc_a + u64 drops, bool hw) + { + if (a->cpu_bstats) { +- _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets); ++ _bstats_update(this_cpu_ptr(a->cpu_bstats), bytes, packets); + + this_cpu_ptr(a->cpu_qstats)->drops += drops; + + if (hw) +- _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats_hw), +- bytes, packets); ++ _bstats_update(this_cpu_ptr(a->cpu_bstats_hw), ++ bytes, packets); + return; + } + +--- a/net/sched/act_bpf.c ++++ b/net/sched/act_bpf.c +@@ -41,7 +41,7 @@ static int tcf_bpf_act(struct sk_buff *s + int action, filter_res; + + tcf_lastuse_update(&prog->tcf_tm); +- bstats_cpu_update(this_cpu_ptr(prog->common.cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(prog->common.cpu_bstats), skb); + + filter = rcu_dereference(prog->filter); + if (at_ingress) { +--- a/net/sched/act_ife.c ++++ b/net/sched/act_ife.c +@@ -718,7 +718,7 @@ static int tcf_ife_decode(struct sk_buff + u8 *tlv_data; + u16 metalen; + +- bstats_cpu_update(this_cpu_ptr(ife->common.cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(ife->common.cpu_bstats), skb); + tcf_lastuse_update(&ife->tcf_tm); + + if (skb_at_tc_ingress(skb)) +@@ -806,7 +806,7 @@ static int tcf_ife_encode(struct sk_buff + exceed_mtu = true; + } + +- bstats_cpu_update(this_cpu_ptr(ife->common.cpu_bstats), 
skb); ++ bstats_update(this_cpu_ptr(ife->common.cpu_bstats), skb); + tcf_lastuse_update(&ife->tcf_tm); + + if (!metalen) { /* no metadata to send */ +--- a/net/sched/act_mpls.c ++++ b/net/sched/act_mpls.c +@@ -59,7 +59,7 @@ static int tcf_mpls_act(struct sk_buff * + int ret, mac_len; + + tcf_lastuse_update(&m->tcf_tm); +- bstats_cpu_update(this_cpu_ptr(m->common.cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(m->common.cpu_bstats), skb); + + /* Ensure 'data' points at mac_header prior calling mpls manipulating + * functions. +--- a/net/sched/act_police.c ++++ b/net/sched/act_police.c +@@ -248,7 +248,7 @@ static int tcf_police_act(struct sk_buff + int ret; + + tcf_lastuse_update(&police->tcf_tm); +- bstats_cpu_update(this_cpu_ptr(police->common.cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(police->common.cpu_bstats), skb); + + ret = READ_ONCE(police->tcf_action); + p = rcu_dereference_bh(police->params); +--- a/net/sched/act_sample.c ++++ b/net/sched/act_sample.c +@@ -163,7 +163,7 @@ static int tcf_sample_act(struct sk_buff + int retval; + + tcf_lastuse_update(&s->tcf_tm); +- bstats_cpu_update(this_cpu_ptr(s->common.cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(s->common.cpu_bstats), skb); + retval = READ_ONCE(s->tcf_action); + + psample_group = rcu_dereference_bh(s->psample_group); +--- a/net/sched/act_simple.c ++++ b/net/sched/act_simple.c +@@ -36,7 +36,8 @@ static int tcf_simp_act(struct sk_buff * + * then it would look like "hello_3" (without quotes) + */ + pr_info("simple: %s_%llu\n", +- (char *)d->tcfd_defdata, d->tcf_bstats.packets); ++ (char *)d->tcfd_defdata, ++ u64_stats_read(&d->tcf_bstats.packets)); + spin_unlock(&d->tcf_lock); + return d->tcf_action; + } +--- a/net/sched/act_skbedit.c ++++ b/net/sched/act_skbedit.c +@@ -31,7 +31,7 @@ static int tcf_skbedit_act(struct sk_buf + int action; + + tcf_lastuse_update(&d->tcf_tm); +- bstats_cpu_update(this_cpu_ptr(d->common.cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(d->common.cpu_bstats), skb); + + 
params = rcu_dereference_bh(d->params); + action = READ_ONCE(d->tcf_action); +--- a/net/sched/act_skbmod.c ++++ b/net/sched/act_skbmod.c +@@ -31,7 +31,7 @@ static int tcf_skbmod_act(struct sk_buff + u64 flags; + + tcf_lastuse_update(&d->tcf_tm); +- bstats_cpu_update(this_cpu_ptr(d->common.cpu_bstats), skb); ++ bstats_update(this_cpu_ptr(d->common.cpu_bstats), skb); + + action = READ_ONCE(d->tcf_action); + if (unlikely(action == TC_ACT_SHOT)) +--- a/net/sched/sch_api.c ++++ b/net/sched/sch_api.c +@@ -884,7 +884,7 @@ static void qdisc_offload_graft_root(str + static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid, + u32 portid, u32 seq, u16 flags, int event) + { +- struct gnet_stats_basic_cpu __percpu *cpu_bstats = NULL; ++ struct gnet_stats_basic_sync __percpu *cpu_bstats = NULL; + struct gnet_stats_queue __percpu *cpu_qstats = NULL; + struct tcmsg *tcm; + struct nlmsghdr *nlh; +--- a/net/sched/sch_atm.c ++++ b/net/sched/sch_atm.c +@@ -52,7 +52,7 @@ struct atm_flow_data { + struct atm_qdisc_data *parent; /* parent qdisc */ + struct socket *sock; /* for closing */ + int ref; /* reference count */ +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + struct gnet_stats_queue qstats; + struct list_head list; + struct atm_flow_data *excess; /* flow for excess traffic; +@@ -548,7 +548,7 @@ static int atm_tc_init(struct Qdisc *sch + pr_debug("atm_tc_init(sch %p,[qdisc %p],opt %p)\n", sch, p, opt); + INIT_LIST_HEAD(&p->flows); + INIT_LIST_HEAD(&p->link.list); +- gnet_stats_basic_packed_init(&p->link.bstats); ++ gnet_stats_basic_sync_init(&p->link.bstats); + list_add(&p->link.list, &p->flows); + p->link.q = qdisc_create_dflt(sch->dev_queue, + &pfifo_qdisc_ops, sch->handle, extack); +--- a/net/sched/sch_cbq.c ++++ b/net/sched/sch_cbq.c +@@ -116,7 +116,7 @@ struct cbq_class { + long avgidle; + long deficit; /* Saved deficit for WRR */ + psched_time_t penalized; +- struct gnet_stats_basic_packed bstats; ++ struct 
gnet_stats_basic_sync bstats; + struct gnet_stats_queue qstats; + struct net_rate_estimator __rcu *rate_est; + struct tc_cbq_xstats xstats; +@@ -1610,7 +1610,7 @@ cbq_change_class(struct Qdisc *sch, u32 + if (cl == NULL) + goto failure; + +- gnet_stats_basic_packed_init(&cl->bstats); ++ gnet_stats_basic_sync_init(&cl->bstats); + err = tcf_block_get(&cl->block, &cl->filter_list, sch, extack); + if (err) { + kfree(cl); +--- a/net/sched/sch_drr.c ++++ b/net/sched/sch_drr.c +@@ -19,7 +19,7 @@ struct drr_class { + struct Qdisc_class_common common; + unsigned int filter_cnt; + +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + struct gnet_stats_queue qstats; + struct net_rate_estimator __rcu *rate_est; + struct list_head alist; +@@ -106,7 +106,7 @@ static int drr_change_class(struct Qdisc + if (cl == NULL) + return -ENOBUFS; + +- gnet_stats_basic_packed_init(&cl->bstats); ++ gnet_stats_basic_sync_init(&cl->bstats); + cl->common.classid = classid; + cl->quantum = quantum; + cl->qdisc = qdisc_create_dflt(sch->dev_queue, +--- a/net/sched/sch_ets.c ++++ b/net/sched/sch_ets.c +@@ -41,7 +41,7 @@ struct ets_class { + struct Qdisc *qdisc; + u32 quantum; + u32 deficit; +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + struct gnet_stats_queue qstats; + }; + +@@ -689,7 +689,7 @@ static int ets_qdisc_change(struct Qdisc + q->classes[i].qdisc = NULL; + q->classes[i].quantum = 0; + q->classes[i].deficit = 0; +- gnet_stats_basic_packed_init(&q->classes[i].bstats); ++ gnet_stats_basic_sync_init(&q->classes[i].bstats); + memset(&q->classes[i].qstats, 0, sizeof(q->classes[i].qstats)); + } + return 0; +--- a/net/sched/sch_generic.c ++++ b/net/sched/sch_generic.c +@@ -892,12 +892,12 @@ struct Qdisc *qdisc_alloc(struct netdev_ + __skb_queue_head_init(&sch->gso_skb); + __skb_queue_head_init(&sch->skb_bad_txq); + qdisc_skb_head_init(&sch->q); +- gnet_stats_basic_packed_init(&sch->bstats); ++ 
gnet_stats_basic_sync_init(&sch->bstats); + spin_lock_init(&sch->q.lock); + + if (ops->static_flags & TCQ_F_CPUSTATS) { + sch->cpu_bstats = +- netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu); ++ netdev_alloc_pcpu_stats(struct gnet_stats_basic_sync); + if (!sch->cpu_bstats) + goto errout1; + +--- a/net/sched/sch_gred.c ++++ b/net/sched/sch_gred.c +@@ -366,7 +366,7 @@ static int gred_offload_dump_stats(struc + hw_stats->parent = sch->parent; + + for (i = 0; i < MAX_DPs; i++) { +- gnet_stats_basic_packed_init(&hw_stats->stats.bstats[i]); ++ gnet_stats_basic_sync_init(&hw_stats->stats.bstats[i]); + if (table->tab[i]) + hw_stats->stats.xstats[i] = &table->tab[i]->stats; + } +@@ -378,12 +378,12 @@ static int gred_offload_dump_stats(struc + for (i = 0; i < MAX_DPs; i++) { + if (!table->tab[i]) + continue; +- table->tab[i]->packetsin += hw_stats->stats.bstats[i].packets; +- table->tab[i]->bytesin += hw_stats->stats.bstats[i].bytes; ++ table->tab[i]->packetsin += u64_stats_read(&hw_stats->stats.bstats[i].packets); ++ table->tab[i]->bytesin += u64_stats_read(&hw_stats->stats.bstats[i].bytes); + table->tab[i]->backlog += hw_stats->stats.qstats[i].backlog; + +- bytes += hw_stats->stats.bstats[i].bytes; +- packets += hw_stats->stats.bstats[i].packets; ++ bytes += u64_stats_read(&hw_stats->stats.bstats[i].bytes); ++ packets += u64_stats_read(&hw_stats->stats.bstats[i].packets); + sch->qstats.qlen += hw_stats->stats.qstats[i].qlen; + sch->qstats.backlog += hw_stats->stats.qstats[i].backlog; + sch->qstats.drops += hw_stats->stats.qstats[i].drops; +--- a/net/sched/sch_hfsc.c ++++ b/net/sched/sch_hfsc.c +@@ -111,7 +111,7 @@ enum hfsc_class_flags { + struct hfsc_class { + struct Qdisc_class_common cl_common; + +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + struct gnet_stats_queue qstats; + struct net_rate_estimator __rcu *rate_est; + struct tcf_proto __rcu *filter_list; /* filter list */ +@@ -1406,7 +1406,7 @@ hfsc_init_qdisc(struct Qdisc 
*sch, struc + if (err) + return err; + +- gnet_stats_basic_packed_init(&q->root.bstats); ++ gnet_stats_basic_sync_init(&q->root.bstats); + q->root.cl_common.classid = sch->handle; + q->root.sched = q; + q->root.qdisc = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops, +--- a/net/sched/sch_htb.c ++++ b/net/sched/sch_htb.c +@@ -113,8 +113,8 @@ struct htb_class { + /* + * Written often fields + */ +- struct gnet_stats_basic_packed bstats; +- struct gnet_stats_basic_packed bstats_bias; ++ struct gnet_stats_basic_sync bstats; ++ struct gnet_stats_basic_sync bstats_bias; + struct tc_htb_xstats xstats; /* our special stats */ + + /* token bucket parameters */ +@@ -1312,7 +1312,7 @@ static void htb_offload_aggregate_stats( + struct htb_class *c; + unsigned int i; + +- gnet_stats_basic_packed_init(&cl->bstats); ++ gnet_stats_basic_sync_init(&cl->bstats); + + for (i = 0; i < q->clhash.hashsize; i++) { + hlist_for_each_entry(c, &q->clhash.hash[i], common.hnode) { +@@ -1324,11 +1324,11 @@ static void htb_offload_aggregate_stats( + if (p != cl) + continue; + +- bytes += c->bstats_bias.bytes; +- packets += c->bstats_bias.packets; ++ bytes += u64_stats_read(&c->bstats_bias.bytes); ++ packets += u64_stats_read(&c->bstats_bias.packets); + if (c->level == 0) { +- bytes += c->leaf.q->bstats.bytes; +- packets += c->leaf.q->bstats.packets; ++ bytes += u64_stats_read(&c->leaf.q->bstats.bytes); ++ packets += u64_stats_read(&c->leaf.q->bstats.packets); + } + } + } +@@ -1359,10 +1359,10 @@ htb_dump_class_stats(struct Qdisc *sch, + if (cl->leaf.q) + cl->bstats = cl->leaf.q->bstats; + else +- gnet_stats_basic_packed_init(&cl->bstats); ++ gnet_stats_basic_sync_init(&cl->bstats); + _bstats_update(&cl->bstats, +- cl->bstats_bias.bytes, +- cl->bstats_bias.packets); ++ u64_stats_read(&cl->bstats_bias.bytes), ++ u64_stats_read(&cl->bstats_bias.packets)); + } else { + htb_offload_aggregate_stats(q, cl); + } +@@ -1582,8 +1582,8 @@ static int htb_destroy_class_offload(str + + if (cl->parent) { + 
_bstats_update(&cl->parent->bstats_bias, +- q->bstats.bytes, +- q->bstats.packets); ++ u64_stats_read(&q->bstats.bytes), ++ u64_stats_read(&q->bstats.packets)); + } + + offload_opt = (struct tc_htb_qopt_offload) { +@@ -1853,8 +1853,8 @@ static int htb_change_class(struct Qdisc + if (!cl) + goto failure; + +- gnet_stats_basic_packed_init(&cl->bstats); +- gnet_stats_basic_packed_init(&cl->bstats_bias); ++ gnet_stats_basic_sync_init(&cl->bstats); ++ gnet_stats_basic_sync_init(&cl->bstats_bias); + + err = tcf_block_get(&cl->block, &cl->filter_list, sch, extack); + if (err) { +@@ -1930,8 +1930,8 @@ static int htb_change_class(struct Qdisc + goto err_kill_estimator; + } + _bstats_update(&parent->bstats_bias, +- old_q->bstats.bytes, +- old_q->bstats.packets); ++ u64_stats_read(&old_q->bstats.bytes), ++ u64_stats_read(&old_q->bstats.packets)); + qdisc_put(old_q); + } + new_q = qdisc_create_dflt(dev_queue, &pfifo_qdisc_ops, +--- a/net/sched/sch_mq.c ++++ b/net/sched/sch_mq.c +@@ -132,7 +132,7 @@ static int mq_dump(struct Qdisc *sch, st + unsigned int ntx; + + sch->q.qlen = 0; +- gnet_stats_basic_packed_init(&sch->bstats); ++ gnet_stats_basic_sync_init(&sch->bstats); + memset(&sch->qstats, 0, sizeof(sch->qstats)); + + /* MQ supports lockless qdiscs. However, statistics accounting needs +--- a/net/sched/sch_mqprio.c ++++ b/net/sched/sch_mqprio.c +@@ -390,7 +390,7 @@ static int mqprio_dump(struct Qdisc *sch + unsigned int ntx, tc; + + sch->q.qlen = 0; +- gnet_stats_basic_packed_init(&sch->bstats); ++ gnet_stats_basic_sync_init(&sch->bstats); + memset(&sch->qstats, 0, sizeof(sch->qstats)); + + /* MQ supports lockless qdiscs. 
However, statistics accounting needs +@@ -500,11 +500,11 @@ static int mqprio_dump_class_stats(struc + int i; + __u32 qlen; + struct gnet_stats_queue qstats = {0}; +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + struct net_device *dev = qdisc_dev(sch); + struct netdev_tc_txq tc = dev->tc_to_txq[cl & TC_BITMASK]; + +- gnet_stats_basic_packed_init(&bstats); ++ gnet_stats_basic_sync_init(&bstats); + /* Drop lock here it will be reclaimed before touching + * statistics this is required because the d->lock we + * hold here is the look on dev_queue->qdisc_sleeping +--- a/net/sched/sch_qfq.c ++++ b/net/sched/sch_qfq.c +@@ -131,7 +131,7 @@ struct qfq_class { + + unsigned int filter_cnt; + +- struct gnet_stats_basic_packed bstats; ++ struct gnet_stats_basic_sync bstats; + struct gnet_stats_queue qstats; + struct net_rate_estimator __rcu *rate_est; + struct Qdisc *qdisc; +@@ -465,7 +465,7 @@ static int qfq_change_class(struct Qdisc + if (cl == NULL) + return -ENOBUFS; + +- gnet_stats_basic_packed_init(&cl->bstats); ++ gnet_stats_basic_sync_init(&cl->bstats); + cl->common.classid = classid; + cl->deficit = lmax; + diff --git a/debian/patches-rt/0008-sched-Fix-hotplug-vs-CPU-bandwidth-control.patch b/debian/patches-rt/0008-sched-Fix-hotplug-vs-CPU-bandwidth-control.patch deleted file mode 100644 index 45f5fdfd5..000000000 --- a/debian/patches-rt/0008-sched-Fix-hotplug-vs-CPU-bandwidth-control.patch +++ /dev/null @@ -1,94 +0,0 @@ -From 7e826e336f66907d688e4c457f2104618d960de3 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:05 +0200 -Subject: [PATCH 008/296] sched: Fix hotplug vs CPU bandwidth control -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Since we now migrate tasks away before DYING, we should also move -bandwidth unthrottle, otherwise we can gain tasks from unthrottle -after we expect all tasks to be gone already. 
- -Also; it looks like the RT balancers don't respect cpu_active() and -instead rely on rq->online in part, complete this. This too requires -we do set_rq_offline() earlier to match the cpu_active() semantics. -(The bigger patch is to convert RT to cpu_active() entirely) - -Since set_rq_online() is called from sched_cpu_activate(), place -set_rq_offline() in sched_cpu_deactivate(). - -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 14 ++++++++++---- - kernel/sched/deadline.c | 2 +- - kernel/sched/rt.c | 2 +- - 3 files changed, 12 insertions(+), 6 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 9ad43e648a78..f7403832b3b1 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -6975,6 +6975,8 @@ int sched_cpu_activate(unsigned int cpu) - - int sched_cpu_deactivate(unsigned int cpu) - { -+ struct rq *rq = cpu_rq(cpu); -+ struct rq_flags rf; - int ret; - - set_cpu_active(cpu, false); -@@ -6989,6 +6991,14 @@ int sched_cpu_deactivate(unsigned int cpu) - - balance_push_set(cpu, true); - -+ rq_lock_irqsave(rq, &rf); -+ if (rq->rd) { -+ update_rq_clock(rq); -+ BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span)); -+ set_rq_offline(rq); -+ } -+ rq_unlock_irqrestore(rq, &rf); -+ - #ifdef CONFIG_SCHED_SMT - /* - * When going down, decrement the number of cores with SMT present. 
-@@ -7070,10 +7080,6 @@ int sched_cpu_dying(unsigned int cpu)
- sched_tick_stop(cpu);
-
- rq_lock_irqsave(rq, &rf);
-- if (rq->rd) {
-- BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
-- set_rq_offline(rq);
-- }
- BUG_ON(rq->nr_running != 1);
- rq_unlock_irqrestore(rq, &rf);
-
-diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
-index 8d06d1f4e2f7..e0564abfece1 100644
---- a/kernel/sched/deadline.c
-+++ b/kernel/sched/deadline.c
-@@ -565,7 +565,7 @@ static int push_dl_task(struct rq *rq);
-
- static inline bool need_pull_dl_task(struct rq *rq, struct task_struct *prev)
- {
-- return dl_task(prev);
-+ return rq->online && dl_task(prev);
- }
-
- static DEFINE_PER_CPU(struct callback_head, dl_push_head);
-diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
-index 49ec096a8aa1..40a46639f78a 100644
---- a/kernel/sched/rt.c
-+++ b/kernel/sched/rt.c
-@@ -265,7 +265,7 @@ static void pull_rt_task(struct rq *this_rq);
-
- static inline bool need_pull_rt_task(struct rq *rq, struct task_struct *prev)
- {
- /* Try to pull RT tasks here if we lower this rq's prio */
-- return rq->rt.highest_prio.curr > prev->prio;
-+ return rq->online && rq->rt.highest_prio.curr > prev->prio;
- }
-
- static inline int rt_overloaded(struct rq *rq)
---
-2.30.2
-
diff --git a/debian/patches-rt/0008_locking_rt_take_rcu_nesting_into_account_for___might_resched.patch b/debian/patches-rt/0008_locking_rt_take_rcu_nesting_into_account_for___might_resched.patch
new file mode 100644
index 000000000..1fa08e1bd
--- /dev/null
+++ b/debian/patches-rt/0008_locking_rt_take_rcu_nesting_into_account_for___might_resched.patch
@@ -0,0 +1,78 @@
+From: Thomas Gleixner <tglx@linutronix.de>
+Subject: locking/rt: Take RCU nesting into account for __might_resched()
+Date: Thu, 23 Sep 2021 18:54:46 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+The general rule that rcu_read_lock() held sections cannot voluntarily sleep
+does apply even on RT kernels. 
Though the substitution of spin/rw locks on
+RT enabled kernels has to be exempt from that rule. On !RT a spin_lock()
+can obviously nest inside a RCU read side critical section as the lock
+acquisition is not going to block, but on RT this is no longer the case
+due to the 'sleeping' spinlock substitution.
+
+The RT patches contained a cheap hack to ignore the RCU nesting depth in
+might_sleep() checks, which was a pragmatic but incorrect workaround.
+
+Instead of generally ignoring the RCU nesting depth in __might_sleep() and
+__might_resched() checks, pass the rcu_preempt_depth() via the offsets
+argument to __might_resched() from spin/read/write_lock() which makes the
+checks work correctly even in RCU read side critical sections.
+
+The actual blocking on such a substituted lock within a RCU read side
+critical section is already handled correctly in __schedule() by treating
+it as a "preemption" of the RCU read side critical section.
+
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Link: https://lore.kernel.org/r/20210923165358.368305497@linutronix.de
+---
+ kernel/locking/spinlock_rt.c | 17 ++++++++++++++---
+ 1 file changed, 14 insertions(+), 3 deletions(-)
+
+--- a/kernel/locking/spinlock_rt.c
++++ b/kernel/locking/spinlock_rt.c
+@@ -24,6 +24,17 @@
+ #define RT_MUTEX_BUILD_SPINLOCKS
+ #include "rtmutex.c"
+
++/*
++ * __might_resched() skips the state check as rtlocks are state
++ * preserving. Take RCU nesting into account as spin/read/write_lock() can
++ * legitimately nest into an RCU read side critical section. 
++ */ ++#define RTLOCK_RESCHED_OFFSETS \ ++ (rcu_preempt_depth() << MIGHT_RESCHED_RCU_SHIFT) ++ ++#define rtlock_might_resched() \ ++ __might_resched(__FILE__, __LINE__, RTLOCK_RESCHED_OFFSETS) ++ + static __always_inline void rtlock_lock(struct rt_mutex_base *rtm) + { + if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current))) +@@ -32,7 +43,7 @@ static __always_inline void rtlock_lock( + + static __always_inline void __rt_spin_lock(spinlock_t *lock) + { +- __might_resched(__FILE__, __LINE__, 0); ++ rtlock_might_resched(); + rtlock_lock(&lock->lock); + rcu_read_lock(); + migrate_disable(); +@@ -210,7 +221,7 @@ EXPORT_SYMBOL(rt_write_trylock); + + void __sched rt_read_lock(rwlock_t *rwlock) + { +- __might_resched(__FILE__, __LINE__, 0); ++ rtlock_might_resched(); + rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_); + rwbase_read_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT); + rcu_read_lock(); +@@ -220,7 +231,7 @@ EXPORT_SYMBOL(rt_read_lock); + + void __sched rt_write_lock(rwlock_t *rwlock) + { +- __might_resched(__FILE__, __LINE__, 0); ++ rtlock_might_resched(); + rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_); + rwbase_write_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT); + rcu_read_lock(); diff --git a/debian/patches-rt/0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch b/debian/patches-rt/0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch new file mode 100644 index 000000000..c06c4c33a --- /dev/null +++ b/debian/patches-rt/0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch @@ -0,0 +1,89 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Wed, 8 Sep 2021 19:03:41 +0200 +Subject: [PATCH 09/10] drm/i915/gt: Use spin_lock_irq() instead of + local_irq_disable() + spin_lock() +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +execlists_dequeue() is invoked from a function which uses +local_irq_disable() to disable interrupts so the spin_lock() behaves +like 
spin_lock_irq().
+This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not
+the same as spin_lock_irq().
+
+execlists_dequeue_irq() and execlists_dequeue() each have only one
+caller. If intel_engine_cs::active::lock is acquired and released with the
+_irq suffix then it behaves almost as if execlists_dequeue() were
+invoked with disabled interrupts. The difference is the last part of the
+function which is then invoked with enabled interrupts.
+I can't tell if this makes a difference. From looking at it, it might
+work to move the last unlock at the end of the function as I didn't find
+anything that would acquire the lock again.
+
+Reported-by: Clark Williams <williams@redhat.com>
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
+---
+ drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 17 +++++------------
+ 1 file changed, 5 insertions(+), 12 deletions(-)
+
+--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
++++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+@@ -1283,7 +1283,7 @@ static void execlists_dequeue(struct int
+ * and context switches) submission.
+ */
+
+- spin_lock(&sched_engine->lock);
++ spin_lock_irq(&sched_engine->lock);
+
+ /*
+ * If the queue is higher priority than the last
+@@ -1383,7 +1383,7 @@ static void execlists_dequeue(struct int
+ * Even if ELSP[1] is occupied and not worthy
+ * of timeslices, our queue might be.
+ */ +- spin_unlock(&sched_engine->lock); ++ spin_unlock_irq(&sched_engine->lock); + return; + } + } +@@ -1409,7 +1409,7 @@ static void execlists_dequeue(struct int + + if (last && !can_merge_rq(last, rq)) { + spin_unlock(&ve->base.sched_engine->lock); +- spin_unlock(&engine->sched_engine->lock); ++ spin_unlock_irq(&engine->sched_engine->lock); + return; /* leave this for another sibling */ + } + +@@ -1571,7 +1571,7 @@ static void execlists_dequeue(struct int + */ + sched_engine->queue_priority_hint = queue_prio(sched_engine); + i915_sched_engine_reset_on_empty(sched_engine); +- spin_unlock(&sched_engine->lock); ++ spin_unlock_irq(&sched_engine->lock); + + /* + * We can skip poking the HW if we ended up with exactly the same set +@@ -1597,13 +1597,6 @@ static void execlists_dequeue(struct int + } + } + +-static void execlists_dequeue_irq(struct intel_engine_cs *engine) +-{ +- local_irq_disable(); /* Suspend interrupts across request submission */ +- execlists_dequeue(engine); +- local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */ +-} +- + static void clear_ports(struct i915_request **ports, int count) + { + memset_p((void **)ports, NULL, count); +@@ -2427,7 +2420,7 @@ static void execlists_submission_tasklet + } + + if (!engine->execlists.pending[0]) { +- execlists_dequeue_irq(engine); ++ execlists_dequeue(engine); + start_timeslice(engine); + } + diff --git a/debian/patches-rt/0009-net-sched-Remove-Qdisc-running-sequence-counter.patch b/debian/patches-rt/0009-net-sched-Remove-Qdisc-running-sequence-counter.patch new file mode 100644 index 000000000..89be30417 --- /dev/null +++ b/debian/patches-rt/0009-net-sched-Remove-Qdisc-running-sequence-counter.patch @@ -0,0 +1,817 @@ +From: "Ahmed S. 
Darwish" <a.darwish@linutronix.de> +Date: Sat, 16 Oct 2021 10:49:10 +0200 +Subject: [PATCH 9/9] net: sched: Remove Qdisc::running sequence counter +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The Qdisc::running sequence counter has two uses: + + 1. Reliably reading qdisc's tc statistics while the qdisc is running + (a seqcount read/retry loop at gnet_stats_add_basic()). + + 2. As a flag, indicating whether the qdisc in question is running + (without any retry loops). + +For the first usage, the Qdisc::running sequence counter write section, +qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what +is actually needed: the raw qdisc's bstats update. A u64_stats sync +point was thus introduced (in previous commits) inside the bstats +structure itself. A local u64_stats write section is then started and +stopped for the bstats updates. + +Use that u64_stats sync point mechanism for the bstats read/retry loop +at gnet_stats_add_basic(). + +For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag, +accessed with atomic bitops, is sufficient. Using a bit flag instead of +a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads +to the SMP barriers implicitly added through raw_read_seqcount() and +write_seqcount_begin/end() getting removed. All call sites have been +surveyed though, and no required ordering was identified. + +Now that the qdisc->running sequence counter is no longer used, remove +it. + +Note, using u64_stats implies no sequence counter protection for 64-bit +architectures. This can lead to the qdisc tc statistics "packets" vs. +"bytes" values getting out of sync on rare occasions. The individual +values will still be valid. + +Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. 
Miller <davem@davemloft.net> +--- + include/linux/netdevice.h | 4 --- + include/net/gen_stats.h | 19 +++++++---------- + include/net/sch_generic.h | 33 ++++++++++++------------------ + net/core/gen_estimator.c | 16 +++++++++----- + net/core/gen_stats.c | 50 +++++++++++++++++++++++++--------------------- + net/sched/act_api.c | 9 ++++---- + net/sched/act_police.c | 2 - + net/sched/sch_api.c | 16 ++------------ + net/sched/sch_atm.c | 3 -- + net/sched/sch_cbq.c | 9 ++------ + net/sched/sch_drr.c | 10 ++------- + net/sched/sch_ets.c | 3 -- + net/sched/sch_generic.c | 10 +-------- + net/sched/sch_hfsc.c | 8 ++----- + net/sched/sch_htb.c | 7 ++---- + net/sched/sch_mq.c | 7 ++---- + net/sched/sch_mqprio.c | 14 ++++++------ + net/sched/sch_multiq.c | 3 -- + net/sched/sch_prio.c | 4 +-- + net/sched/sch_qfq.c | 7 ++---- + net/sched/sch_taprio.c | 2 - + 21 files changed, 102 insertions(+), 134 deletions(-) + +--- a/include/linux/netdevice.h ++++ b/include/linux/netdevice.h +@@ -1916,7 +1916,6 @@ enum netdev_ml_priv_type { + * @sfp_bus: attached &struct sfp_bus structure. 
+ * + * @qdisc_tx_busylock: lockdep class annotating Qdisc->busylock spinlock +- * @qdisc_running_key: lockdep class annotating Qdisc->running seqcount + * + * @proto_down: protocol port state information can be sent to the + * switch driver and used to set the phys state of the +@@ -2250,7 +2249,6 @@ struct net_device { + struct phy_device *phydev; + struct sfp_bus *sfp_bus; + struct lock_class_key *qdisc_tx_busylock; +- struct lock_class_key *qdisc_running_key; + bool proto_down; + unsigned wol_enabled:1; + unsigned threaded:1; +@@ -2360,13 +2358,11 @@ static inline void netdev_for_each_tx_qu + #define netdev_lockdep_set_classes(dev) \ + { \ + static struct lock_class_key qdisc_tx_busylock_key; \ +- static struct lock_class_key qdisc_running_key; \ + static struct lock_class_key qdisc_xmit_lock_key; \ + static struct lock_class_key dev_addr_list_lock_key; \ + unsigned int i; \ + \ + (dev)->qdisc_tx_busylock = &qdisc_tx_busylock_key; \ +- (dev)->qdisc_running_key = &qdisc_running_key; \ + lockdep_set_class(&(dev)->addr_list_lock, \ + &dev_addr_list_lock_key); \ + for (i = 0; i < (dev)->num_tx_queues; i++) \ +--- a/include/net/gen_stats.h ++++ b/include/net/gen_stats.h +@@ -46,18 +46,15 @@ int gnet_stats_start_copy_compat(struct + spinlock_t *lock, struct gnet_dump *d, + int padattr); + +-int gnet_stats_copy_basic(const seqcount_t *running, +- struct gnet_dump *d, ++int gnet_stats_copy_basic(struct gnet_dump *d, + struct gnet_stats_basic_sync __percpu *cpu, +- struct gnet_stats_basic_sync *b); +-void gnet_stats_add_basic(const seqcount_t *running, +- struct gnet_stats_basic_sync *bstats, ++ struct gnet_stats_basic_sync *b, bool running); ++void gnet_stats_add_basic(struct gnet_stats_basic_sync *bstats, + struct gnet_stats_basic_sync __percpu *cpu, +- struct gnet_stats_basic_sync *b); +-int gnet_stats_copy_basic_hw(const seqcount_t *running, +- struct gnet_dump *d, ++ struct gnet_stats_basic_sync *b, bool running); ++int gnet_stats_copy_basic_hw(struct gnet_dump *d, 
+ struct gnet_stats_basic_sync __percpu *cpu, +- struct gnet_stats_basic_sync *b); ++ struct gnet_stats_basic_sync *b, bool running); + int gnet_stats_copy_rate_est(struct gnet_dump *d, + struct net_rate_estimator __rcu **ptr); + int gnet_stats_copy_queue(struct gnet_dump *d, +@@ -74,13 +71,13 @@ int gen_new_estimator(struct gnet_stats_ + struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **rate_est, + spinlock_t *lock, +- seqcount_t *running, struct nlattr *opt); ++ bool running, struct nlattr *opt); + void gen_kill_estimator(struct net_rate_estimator __rcu **ptr); + int gen_replace_estimator(struct gnet_stats_basic_sync *bstats, + struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **ptr, + spinlock_t *lock, +- seqcount_t *running, struct nlattr *opt); ++ bool running, struct nlattr *opt); + bool gen_estimator_active(struct net_rate_estimator __rcu **ptr); + bool gen_estimator_read(struct net_rate_estimator __rcu **ptr, + struct gnet_stats_rate_est64 *sample); +--- a/include/net/sch_generic.h ++++ b/include/net/sch_generic.h +@@ -38,6 +38,10 @@ enum qdisc_state_t { + __QDISC_STATE_DEACTIVATED, + __QDISC_STATE_MISSED, + __QDISC_STATE_DRAINING, ++ /* Only for !TCQ_F_NOLOCK qdisc. Never access it directly. ++ * Use qdisc_run_begin/end() or qdisc_is_running() instead. 
++ */ ++ __QDISC_STATE_RUNNING, + }; + + #define QDISC_STATE_MISSED BIT(__QDISC_STATE_MISSED) +@@ -108,7 +112,6 @@ struct Qdisc { + struct sk_buff_head gso_skb ____cacheline_aligned_in_smp; + struct qdisc_skb_head q; + struct gnet_stats_basic_sync bstats; +- seqcount_t running; + struct gnet_stats_queue qstats; + unsigned long state; + struct Qdisc *next_sched; +@@ -143,11 +146,15 @@ static inline struct Qdisc *qdisc_refcou + return NULL; + } + ++/* For !TCQ_F_NOLOCK qdisc: callers must either call this within a qdisc ++ * root_lock section, or provide their own memory barriers -- ordering ++ * against qdisc_run_begin/end() atomic bit operations. ++ */ + static inline bool qdisc_is_running(struct Qdisc *qdisc) + { + if (qdisc->flags & TCQ_F_NOLOCK) + return spin_is_locked(&qdisc->seqlock); +- return (raw_read_seqcount(&qdisc->running) & 1) ? true : false; ++ return test_bit(__QDISC_STATE_RUNNING, &qdisc->state); + } + + static inline bool nolock_qdisc_is_empty(const struct Qdisc *qdisc) +@@ -167,6 +174,9 @@ static inline bool qdisc_is_empty(const + return !READ_ONCE(qdisc->q.qlen); + } + ++/* For !TCQ_F_NOLOCK qdisc, qdisc_run_begin/end() must be invoked with ++ * the qdisc root lock acquired. ++ */ + static inline bool qdisc_run_begin(struct Qdisc *qdisc) + { + if (qdisc->flags & TCQ_F_NOLOCK) { +@@ -206,15 +216,8 @@ static inline bool qdisc_run_begin(struc + * after it releases the lock at the end of qdisc_run_end(). + */ + return spin_trylock(&qdisc->seqlock); +- } else if (qdisc_is_running(qdisc)) { +- return false; + } +- /* Variant of write_seqcount_begin() telling lockdep a trylock +- * was attempted. 
+- */
+- raw_write_seqcount_begin(&qdisc->running);
+- seqcount_acquire(&qdisc->running.dep_map, 0, 1, _RET_IP_);
+- return true;
++ return !test_and_set_bit(__QDISC_STATE_RUNNING, &qdisc->state);
+ }
+
+ static inline void qdisc_run_end(struct Qdisc *qdisc)
+@@ -226,7 +229,7 @@ static inline void qdisc_run_end(struct
+ &qdisc->state)))
+ __netif_schedule(qdisc);
+ } else {
+- write_seqcount_end(&qdisc->running);
++ clear_bit(__QDISC_STATE_RUNNING, &qdisc->state);
+ }
+ }
+
+@@ -590,14 +593,6 @@ static inline spinlock_t *qdisc_root_sle
+ return qdisc_lock(root);
+ }
+
+-static inline seqcount_t *qdisc_root_sleeping_running(const struct Qdisc *qdisc)
+-{
+- struct Qdisc *root = qdisc_root_sleeping(qdisc);
+-
+- ASSERT_RTNL();
+- return &root->running;
+-}
+-
+ static inline struct net_device *qdisc_dev(const struct Qdisc *qdisc)
+ {
+ return qdisc->dev_queue->dev;
+ }
+--- a/net/core/gen_estimator.c
++++ b/net/core/gen_estimator.c
+@@ -42,7 +42,7 @@
+ struct net_rate_estimator {
+ struct gnet_stats_basic_sync *bstats;
+ spinlock_t *stats_lock;
+- seqcount_t *running;
++ bool running;
+ struct gnet_stats_basic_sync __percpu *cpu_bstats;
+ u8 ewma_log;
+ u8 intvl_log; /* period : (250ms << intvl_log) */
+@@ -66,7 +66,7 @@ static void est_fetch_counters(struct ne
+ if (e->stats_lock)
+ spin_lock(e->stats_lock);
+
+- gnet_stats_add_basic(e->running, b, e->cpu_bstats, e->bstats);
++ gnet_stats_add_basic(b, e->cpu_bstats, e->bstats, e->running);
+
+ if (e->stats_lock)
+ spin_unlock(e->stats_lock);
+@@ -113,7 +113,9 @@ static void est_timer(struct timer_list
+ * @cpu_bstats: bstats per cpu
+ * @rate_est: rate estimator statistics
+ * @lock: lock for statistics and control path
+- * @running: qdisc running seqcount
++ * @running: true if @bstats represents a running qdisc, thus @bstats'
++ * internal values might change during basic reads.
Only used ++ * if @bstats_cpu is NULL + * @opt: rate estimator configuration TLV + * + * Creates a new rate estimator with &bstats as source and &rate_est +@@ -129,7 +131,7 @@ int gen_new_estimator(struct gnet_stats_ + struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **rate_est, + spinlock_t *lock, +- seqcount_t *running, ++ bool running, + struct nlattr *opt) + { + struct gnet_estimator *parm = nla_data(opt); +@@ -218,7 +220,9 @@ EXPORT_SYMBOL(gen_kill_estimator); + * @cpu_bstats: bstats per cpu + * @rate_est: rate estimator statistics + * @lock: lock for statistics and control path +- * @running: qdisc running seqcount (might be NULL) ++ * @running: true if @bstats represents a running qdisc, thus @bstats' ++ * internal values might change during basic reads. Only used ++ * if @cpu_bstats is NULL + * @opt: rate estimator configuration TLV + * + * Replaces the configuration of a rate estimator by calling +@@ -230,7 +234,7 @@ int gen_replace_estimator(struct gnet_st + struct gnet_stats_basic_sync __percpu *cpu_bstats, + struct net_rate_estimator __rcu **rate_est, + spinlock_t *lock, +- seqcount_t *running, struct nlattr *opt) ++ bool running, struct nlattr *opt) + { + return gen_new_estimator(bstats, cpu_bstats, rate_est, + lock, running, opt); +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -146,42 +146,42 @@ static void gnet_stats_add_basic_cpu(str + _bstats_update(bstats, t_bytes, t_packets); + } + +-void gnet_stats_add_basic(const seqcount_t *running, +- struct gnet_stats_basic_sync *bstats, ++void gnet_stats_add_basic(struct gnet_stats_basic_sync *bstats, + struct gnet_stats_basic_sync __percpu *cpu, +- struct gnet_stats_basic_sync *b) ++ struct gnet_stats_basic_sync *b, bool running) + { +- unsigned int seq; ++ unsigned int start; + u64 bytes = 0; + u64 packets = 0; + ++ WARN_ON_ONCE((cpu || running) && !in_task()); ++ + if (cpu) { + gnet_stats_add_basic_cpu(bstats, cpu); + return; + } + do { + if (running) +- 
seq = read_seqcount_begin(running); ++ start = u64_stats_fetch_begin_irq(&b->syncp); + bytes = u64_stats_read(&b->bytes); + packets = u64_stats_read(&b->packets); +- } while (running && read_seqcount_retry(running, seq)); ++ } while (running && u64_stats_fetch_retry_irq(&b->syncp, start)); + + _bstats_update(bstats, bytes, packets); + } + EXPORT_SYMBOL(gnet_stats_add_basic); + + static int +-___gnet_stats_copy_basic(const seqcount_t *running, +- struct gnet_dump *d, ++___gnet_stats_copy_basic(struct gnet_dump *d, + struct gnet_stats_basic_sync __percpu *cpu, + struct gnet_stats_basic_sync *b, +- int type) ++ int type, bool running) + { + struct gnet_stats_basic_sync bstats; + u64 bstats_bytes, bstats_packets; + + gnet_stats_basic_sync_init(&bstats); +- gnet_stats_add_basic(running, &bstats, cpu, b); ++ gnet_stats_add_basic(&bstats, cpu, b, running); + + bstats_bytes = u64_stats_read(&bstats.bytes); + bstats_packets = u64_stats_read(&bstats.packets); +@@ -210,10 +210,14 @@ static int + + /** + * gnet_stats_copy_basic - copy basic statistics into statistic TLV +- * @running: seqcount_t pointer + * @d: dumping handle + * @cpu: copy statistic per cpu + * @b: basic statistics ++ * @running: true if @b represents a running qdisc, thus @b's ++ * internal values might change during basic reads. ++ * Only used if @cpu is NULL ++ * ++ * Context: task; must not be run from IRQ or BH contexts + * + * Appends the basic statistics to the top level TLV created by + * gnet_stats_start_copy(). +@@ -222,22 +226,25 @@ static int + * if the room in the socket buffer was not sufficient. 
+ */ + int +-gnet_stats_copy_basic(const seqcount_t *running, +- struct gnet_dump *d, ++gnet_stats_copy_basic(struct gnet_dump *d, + struct gnet_stats_basic_sync __percpu *cpu, +- struct gnet_stats_basic_sync *b) ++ struct gnet_stats_basic_sync *b, ++ bool running) + { +- return ___gnet_stats_copy_basic(running, d, cpu, b, +- TCA_STATS_BASIC); ++ return ___gnet_stats_copy_basic(d, cpu, b, TCA_STATS_BASIC, running); + } + EXPORT_SYMBOL(gnet_stats_copy_basic); + + /** + * gnet_stats_copy_basic_hw - copy basic hw statistics into statistic TLV +- * @running: seqcount_t pointer + * @d: dumping handle + * @cpu: copy statistic per cpu + * @b: basic statistics ++ * @running: true if @b represents a running qdisc, thus @b's ++ * internal values might change during basic reads. ++ * Only used if @cpu is NULL ++ * ++ * Context: task; must not be run from IRQ or BH contexts + * + * Appends the basic statistics to the top level TLV created by + * gnet_stats_start_copy(). +@@ -246,13 +253,12 @@ EXPORT_SYMBOL(gnet_stats_copy_basic); + * if the room in the socket buffer was not sufficient. 
+ */ + int +-gnet_stats_copy_basic_hw(const seqcount_t *running, +- struct gnet_dump *d, ++gnet_stats_copy_basic_hw(struct gnet_dump *d, + struct gnet_stats_basic_sync __percpu *cpu, +- struct gnet_stats_basic_sync *b) ++ struct gnet_stats_basic_sync *b, ++ bool running) + { +- return ___gnet_stats_copy_basic(running, d, cpu, b, +- TCA_STATS_BASIC_HW); ++ return ___gnet_stats_copy_basic(d, cpu, b, TCA_STATS_BASIC_HW, running); + } + EXPORT_SYMBOL(gnet_stats_copy_basic_hw); + +--- a/net/sched/act_api.c ++++ b/net/sched/act_api.c +@@ -501,7 +501,7 @@ int tcf_idr_create(struct tc_action_net + if (est) { + err = gen_new_estimator(&p->tcfa_bstats, p->cpu_bstats, + &p->tcfa_rate_est, +- &p->tcfa_lock, NULL, est); ++ &p->tcfa_lock, false, est); + if (err) + goto err4; + } +@@ -1173,9 +1173,10 @@ int tcf_action_copy_stats(struct sk_buff + if (err < 0) + goto errout; + +- if (gnet_stats_copy_basic(NULL, &d, p->cpu_bstats, &p->tcfa_bstats) < 0 || +- gnet_stats_copy_basic_hw(NULL, &d, p->cpu_bstats_hw, +- &p->tcfa_bstats_hw) < 0 || ++ if (gnet_stats_copy_basic(&d, p->cpu_bstats, ++ &p->tcfa_bstats, false) < 0 || ++ gnet_stats_copy_basic_hw(&d, p->cpu_bstats_hw, ++ &p->tcfa_bstats_hw, false) < 0 || + gnet_stats_copy_rate_est(&d, &p->tcfa_rate_est) < 0 || + gnet_stats_copy_queue(&d, p->cpu_qstats, + &p->tcfa_qstats, +--- a/net/sched/act_police.c ++++ b/net/sched/act_police.c +@@ -125,7 +125,7 @@ static int tcf_police_init(struct net *n + police->common.cpu_bstats, + &police->tcf_rate_est, + &police->tcf_lock, +- NULL, est); ++ false, est); + if (err) + goto failure; + } else if (tb[TCA_POLICE_AVRATE] && +--- a/net/sched/sch_api.c ++++ b/net/sched/sch_api.c +@@ -942,8 +942,7 @@ static int tc_fill_qdisc(struct sk_buff + cpu_qstats = q->cpu_qstats; + } + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(q), +- &d, cpu_bstats, &q->bstats) < 0 || ++ if (gnet_stats_copy_basic(&d, cpu_bstats, &q->bstats, true) < 0 || + gnet_stats_copy_rate_est(&d, &q->rate_est) < 0 || + 
gnet_stats_copy_queue(&d, cpu_qstats, &q->qstats, qlen) < 0) + goto nla_put_failure; +@@ -1264,26 +1263,17 @@ static struct Qdisc *qdisc_create(struct + rcu_assign_pointer(sch->stab, stab); + } + if (tca[TCA_RATE]) { +- seqcount_t *running; +- + err = -EOPNOTSUPP; + if (sch->flags & TCQ_F_MQROOT) { + NL_SET_ERR_MSG(extack, "Cannot attach rate estimator to a multi-queue root qdisc"); + goto err_out4; + } + +- if (sch->parent != TC_H_ROOT && +- !(sch->flags & TCQ_F_INGRESS) && +- (!p || !(p->flags & TCQ_F_MQROOT))) +- running = qdisc_root_sleeping_running(sch); +- else +- running = &sch->running; +- + err = gen_new_estimator(&sch->bstats, + sch->cpu_bstats, + &sch->rate_est, + NULL, +- running, ++ true, + tca[TCA_RATE]); + if (err) { + NL_SET_ERR_MSG(extack, "Failed to generate new estimator"); +@@ -1359,7 +1349,7 @@ static int qdisc_change(struct Qdisc *sc + sch->cpu_bstats, + &sch->rate_est, + NULL, +- qdisc_root_sleeping_running(sch), ++ true, + tca[TCA_RATE]); + } + out: +--- a/net/sched/sch_atm.c ++++ b/net/sched/sch_atm.c +@@ -653,8 +653,7 @@ atm_tc_dump_class_stats(struct Qdisc *sc + { + struct atm_flow_data *flow = (struct atm_flow_data *)arg; + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, NULL, &flow->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &flow->bstats, true) < 0 || + gnet_stats_copy_queue(d, NULL, &flow->qstats, flow->q->q.qlen) < 0) + return -1; + +--- a/net/sched/sch_cbq.c ++++ b/net/sched/sch_cbq.c +@@ -1383,8 +1383,7 @@ cbq_dump_class_stats(struct Qdisc *sch, + if (cl->undertime != PSCHED_PASTPERFECT) + cl->xstats.undertime = cl->undertime - q->now; + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, NULL, &cl->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &cl->bstats, true) < 0 || + gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 || + gnet_stats_copy_queue(d, NULL, &cl->qstats, qlen) < 0) + return -1; +@@ -1518,7 +1517,7 @@ cbq_change_class(struct Qdisc *sch, u32 + err = 
gen_replace_estimator(&cl->bstats, NULL, + &cl->rate_est, + NULL, +- qdisc_root_sleeping_running(sch), ++ true, + tca[TCA_RATE]); + if (err) { + NL_SET_ERR_MSG(extack, "Failed to replace specified rate estimator"); +@@ -1619,9 +1618,7 @@ cbq_change_class(struct Qdisc *sch, u32 + + if (tca[TCA_RATE]) { + err = gen_new_estimator(&cl->bstats, NULL, &cl->rate_est, +- NULL, +- qdisc_root_sleeping_running(sch), +- tca[TCA_RATE]); ++ NULL, true, tca[TCA_RATE]); + if (err) { + NL_SET_ERR_MSG(extack, "Couldn't create new estimator"); + tcf_block_put(cl->block); +--- a/net/sched/sch_drr.c ++++ b/net/sched/sch_drr.c +@@ -85,8 +85,7 @@ static int drr_change_class(struct Qdisc + if (tca[TCA_RATE]) { + err = gen_replace_estimator(&cl->bstats, NULL, + &cl->rate_est, +- NULL, +- qdisc_root_sleeping_running(sch), ++ NULL, true, + tca[TCA_RATE]); + if (err) { + NL_SET_ERR_MSG(extack, "Failed to replace estimator"); +@@ -119,9 +118,7 @@ static int drr_change_class(struct Qdisc + + if (tca[TCA_RATE]) { + err = gen_replace_estimator(&cl->bstats, NULL, &cl->rate_est, +- NULL, +- qdisc_root_sleeping_running(sch), +- tca[TCA_RATE]); ++ NULL, true, tca[TCA_RATE]); + if (err) { + NL_SET_ERR_MSG(extack, "Failed to replace estimator"); + qdisc_put(cl->qdisc); +@@ -268,8 +265,7 @@ static int drr_dump_class_stats(struct Q + if (qlen) + xstats.deficit = cl->deficit; + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, NULL, &cl->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &cl->bstats, true) < 0 || + gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 || + gnet_stats_copy_queue(d, cl_q->cpu_qstats, &cl_q->qstats, qlen) < 0) + return -1; +--- a/net/sched/sch_ets.c ++++ b/net/sched/sch_ets.c +@@ -325,8 +325,7 @@ static int ets_class_dump_stats(struct Q + struct ets_class *cl = ets_class_from_arg(sch, arg); + struct Qdisc *cl_q = cl->qdisc; + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, NULL, &cl_q->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, 
&cl_q->bstats, true) < 0 || + qdisc_qstats_copy(d, cl_q) < 0) + return -1; + +--- a/net/sched/sch_generic.c ++++ b/net/sched/sch_generic.c +@@ -304,8 +304,8 @@ static struct sk_buff *dequeue_skb(struc + + /* + * Transmit possibly several skbs, and handle the return status as +- * required. Owning running seqcount bit guarantees that +- * only one CPU can execute this function. ++ * required. Owning qdisc running bit guarantees that only one CPU ++ * can execute this function. + * + * Returns to the caller: + * false - hardware queue frozen backoff +@@ -606,7 +606,6 @@ struct Qdisc noop_qdisc = { + .ops = &noop_qdisc_ops, + .q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock), + .dev_queue = &noop_netdev_queue, +- .running = SEQCNT_ZERO(noop_qdisc.running), + .busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock), + .gso_skb = { + .next = (struct sk_buff *)&noop_qdisc.gso_skb, +@@ -867,7 +866,6 @@ struct Qdisc_ops pfifo_fast_ops __read_m + EXPORT_SYMBOL(pfifo_fast_ops); + + static struct lock_class_key qdisc_tx_busylock; +-static struct lock_class_key qdisc_running_key; + + struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue, + const struct Qdisc_ops *ops, +@@ -917,10 +915,6 @@ struct Qdisc *qdisc_alloc(struct netdev_ + lockdep_set_class(&sch->seqlock, + dev->qdisc_tx_busylock ?: &qdisc_tx_busylock); + +- seqcount_init(&sch->running); +- lockdep_set_class(&sch->running, +- dev->qdisc_running_key ?: &qdisc_running_key); +- + sch->ops = ops; + sch->flags = ops->static_flags; + sch->enqueue = ops->enqueue; +--- a/net/sched/sch_hfsc.c ++++ b/net/sched/sch_hfsc.c +@@ -965,7 +965,7 @@ hfsc_change_class(struct Qdisc *sch, u32 + err = gen_replace_estimator(&cl->bstats, NULL, + &cl->rate_est, + NULL, +- qdisc_root_sleeping_running(sch), ++ true, + tca[TCA_RATE]); + if (err) + return err; +@@ -1033,9 +1033,7 @@ hfsc_change_class(struct Qdisc *sch, u32 + + if (tca[TCA_RATE]) { + err = gen_new_estimator(&cl->bstats, NULL, &cl->rate_est, +- NULL, +- 
qdisc_root_sleeping_running(sch), +- tca[TCA_RATE]); ++ NULL, true, tca[TCA_RATE]); + if (err) { + tcf_block_put(cl->block); + kfree(cl); +@@ -1328,7 +1326,7 @@ hfsc_dump_class_stats(struct Qdisc *sch, + xstats.work = cl->cl_total; + xstats.rtwork = cl->cl_cumul; + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), d, NULL, &cl->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &cl->bstats, true) < 0 || + gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 || + gnet_stats_copy_queue(d, NULL, &cl->qstats, qlen) < 0) + return -1; +--- a/net/sched/sch_htb.c ++++ b/net/sched/sch_htb.c +@@ -1368,8 +1368,7 @@ htb_dump_class_stats(struct Qdisc *sch, + } + } + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, NULL, &cl->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &cl->bstats, true) < 0 || + gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 || + gnet_stats_copy_queue(d, NULL, &qs, qlen) < 0) + return -1; +@@ -1865,7 +1864,7 @@ static int htb_change_class(struct Qdisc + err = gen_new_estimator(&cl->bstats, NULL, + &cl->rate_est, + NULL, +- qdisc_root_sleeping_running(sch), ++ true, + tca[TCA_RATE] ? 
: &est.nla); + if (err) + goto err_block_put; +@@ -1991,7 +1990,7 @@ static int htb_change_class(struct Qdisc + err = gen_replace_estimator(&cl->bstats, NULL, + &cl->rate_est, + NULL, +- qdisc_root_sleeping_running(sch), ++ true, + tca[TCA_RATE]); + if (err) + return err; +--- a/net/sched/sch_mq.c ++++ b/net/sched/sch_mq.c +@@ -144,8 +144,8 @@ static int mq_dump(struct Qdisc *sch, st + qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping; + spin_lock_bh(qdisc_lock(qdisc)); + +- gnet_stats_add_basic(NULL, &sch->bstats, qdisc->cpu_bstats, +- &qdisc->bstats); ++ gnet_stats_add_basic(&sch->bstats, qdisc->cpu_bstats, ++ &qdisc->bstats, false); + gnet_stats_add_queue(&sch->qstats, qdisc->cpu_qstats, + &qdisc->qstats); + sch->q.qlen += qdisc_qlen(qdisc); +@@ -231,8 +231,7 @@ static int mq_dump_class_stats(struct Qd + struct netdev_queue *dev_queue = mq_queue_get(sch, cl); + + sch = dev_queue->qdisc_sleeping; +- if (gnet_stats_copy_basic(&sch->running, d, sch->cpu_bstats, +- &sch->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, sch->cpu_bstats, &sch->bstats, true) < 0 || + qdisc_qstats_copy(d, sch) < 0) + return -1; + return 0; +--- a/net/sched/sch_mqprio.c ++++ b/net/sched/sch_mqprio.c +@@ -402,8 +402,8 @@ static int mqprio_dump(struct Qdisc *sch + qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping; + spin_lock_bh(qdisc_lock(qdisc)); + +- gnet_stats_add_basic(NULL, &sch->bstats, qdisc->cpu_bstats, +- &qdisc->bstats); ++ gnet_stats_add_basic(&sch->bstats, qdisc->cpu_bstats, ++ &qdisc->bstats, false); + gnet_stats_add_queue(&sch->qstats, qdisc->cpu_qstats, + &qdisc->qstats); + sch->q.qlen += qdisc_qlen(qdisc); +@@ -519,8 +519,8 @@ static int mqprio_dump_class_stats(struc + + spin_lock_bh(qdisc_lock(qdisc)); + +- gnet_stats_add_basic(NULL, &bstats, qdisc->cpu_bstats, +- &qdisc->bstats); ++ gnet_stats_add_basic(&bstats, qdisc->cpu_bstats, ++ &qdisc->bstats, false); + gnet_stats_add_queue(&qstats, qdisc->cpu_qstats, + &qdisc->qstats); + sch->q.qlen += qdisc_qlen(qdisc); +@@ 
-532,15 +532,15 @@ static int mqprio_dump_class_stats(struc + /* Reclaim root sleeping lock before completing stats */ + if (d->lock) + spin_lock_bh(d->lock); +- if (gnet_stats_copy_basic(NULL, d, NULL, &bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &bstats, false) < 0 || + gnet_stats_copy_queue(d, NULL, &qstats, qlen) < 0) + return -1; + } else { + struct netdev_queue *dev_queue = mqprio_queue_get(sch, cl); + + sch = dev_queue->qdisc_sleeping; +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), d, +- sch->cpu_bstats, &sch->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, sch->cpu_bstats, ++ &sch->bstats, true) < 0 || + qdisc_qstats_copy(d, sch) < 0) + return -1; + } +--- a/net/sched/sch_multiq.c ++++ b/net/sched/sch_multiq.c +@@ -338,8 +338,7 @@ static int multiq_dump_class_stats(struc + struct Qdisc *cl_q; + + cl_q = q->queues[cl - 1]; +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, cl_q->cpu_bstats, &cl_q->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, cl_q->cpu_bstats, &cl_q->bstats, true) < 0 || + qdisc_qstats_copy(d, cl_q) < 0) + return -1; + +--- a/net/sched/sch_prio.c ++++ b/net/sched/sch_prio.c +@@ -361,8 +361,8 @@ static int prio_dump_class_stats(struct + struct Qdisc *cl_q; + + cl_q = q->queues[cl - 1]; +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, cl_q->cpu_bstats, &cl_q->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, cl_q->cpu_bstats, ++ &cl_q->bstats, true) < 0 || + qdisc_qstats_copy(d, cl_q) < 0) + return -1; + +--- a/net/sched/sch_qfq.c ++++ b/net/sched/sch_qfq.c +@@ -451,7 +451,7 @@ static int qfq_change_class(struct Qdisc + err = gen_replace_estimator(&cl->bstats, NULL, + &cl->rate_est, + NULL, +- qdisc_root_sleeping_running(sch), ++ true, + tca[TCA_RATE]); + if (err) + return err; +@@ -478,7 +478,7 @@ static int qfq_change_class(struct Qdisc + err = gen_new_estimator(&cl->bstats, NULL, + &cl->rate_est, + NULL, +- qdisc_root_sleeping_running(sch), ++ true, + tca[TCA_RATE]); + if (err) + 
goto destroy_class; +@@ -640,8 +640,7 @@ static int qfq_dump_class_stats(struct Q + xstats.weight = cl->agg->class_weight; + xstats.lmax = cl->agg->lmax; + +- if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), +- d, NULL, &cl->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &cl->bstats, true) < 0 || + gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 || + qdisc_qstats_copy(d, cl->qdisc) < 0) + return -1; +--- a/net/sched/sch_taprio.c ++++ b/net/sched/sch_taprio.c +@@ -1977,7 +1977,7 @@ static int taprio_dump_class_stats(struc + struct netdev_queue *dev_queue = taprio_queue_get(sch, cl); + + sch = dev_queue->qdisc_sleeping; +- if (gnet_stats_copy_basic(&sch->running, d, NULL, &sch->bstats) < 0 || ++ if (gnet_stats_copy_basic(d, NULL, &sch->bstats, true) < 0 || + qdisc_qstats_copy(d, sch) < 0) + return -1; + return 0; diff --git a/debian/patches-rt/0009-sched-Massage-set_cpus_allowed.patch b/debian/patches-rt/0009-sched-Massage-set_cpus_allowed.patch deleted file mode 100644 index 2886cb4ec..000000000 --- a/debian/patches-rt/0009-sched-Massage-set_cpus_allowed.patch +++ /dev/null @@ -1,175 +0,0 @@ -From 434caa1bbb1880ba459451e26923fce47fe637c5 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:06 +0200 -Subject: [PATCH 009/296] sched: Massage set_cpus_allowed() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Thread a u32 flags word through the *set_cpus_allowed*() callchain. -This will allow adding behavioural tweaks for future users. 
- -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 28 ++++++++++++++++++---------- - kernel/sched/deadline.c | 5 +++-- - kernel/sched/sched.h | 7 +++++-- - 3 files changed, 26 insertions(+), 14 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index f7403832b3b1..1323b3f9d40e 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1822,13 +1822,14 @@ static int migration_cpu_stop(void *data) - * sched_class::set_cpus_allowed must do the below, but is not required to - * actually call this function. - */ --void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask) -+void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask, u32 flags) - { - cpumask_copy(&p->cpus_mask, new_mask); - p->nr_cpus_allowed = cpumask_weight(new_mask); - } - --void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) -+static void -+__do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask, u32 flags) - { - struct rq *rq = task_rq(p); - bool queued, running; -@@ -1849,7 +1850,7 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) - if (running) - put_prev_task(rq, p); - -- p->sched_class->set_cpus_allowed(p, new_mask); -+ p->sched_class->set_cpus_allowed(p, new_mask, flags); - - if (queued) - enqueue_task(rq, p, ENQUEUE_RESTORE | ENQUEUE_NOCLOCK); -@@ -1857,6 +1858,11 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) - set_next_task(rq, p); - } - -+void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) -+{ -+ __do_set_cpus_allowed(p, new_mask, 0); -+} -+ - /* - * Change a given task's CPU affinity. 
Migrate the thread to a - * proper CPU and schedule it away if the CPU it's executing on -@@ -1867,7 +1873,8 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) - * call is not atomic; no spinlocks may be held. - */ - static int __set_cpus_allowed_ptr(struct task_struct *p, -- const struct cpumask *new_mask, bool check) -+ const struct cpumask *new_mask, -+ u32 flags) - { - const struct cpumask *cpu_valid_mask = cpu_active_mask; - unsigned int dest_cpu; -@@ -1889,7 +1896,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, - * Must re-check here, to close a race against __kthread_bind(), - * sched_setaffinity() is not guaranteed to observe the flag. - */ -- if (check && (p->flags & PF_NO_SETAFFINITY)) { -+ if ((flags & SCA_CHECK) && (p->flags & PF_NO_SETAFFINITY)) { - ret = -EINVAL; - goto out; - } -@@ -1908,7 +1915,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, - goto out; - } - -- do_set_cpus_allowed(p, new_mask); -+ __do_set_cpus_allowed(p, new_mask, flags); - - if (p->flags & PF_KTHREAD) { - /* -@@ -1945,7 +1952,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, - - int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask) - { -- return __set_cpus_allowed_ptr(p, new_mask, false); -+ return __set_cpus_allowed_ptr(p, new_mask, 0); - } - EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr); - -@@ -2404,7 +2411,8 @@ void sched_set_stop_task(int cpu, struct task_struct *stop) - #else - - static inline int __set_cpus_allowed_ptr(struct task_struct *p, -- const struct cpumask *new_mask, bool check) -+ const struct cpumask *new_mask, -+ u32 flags) - { - return set_cpus_allowed_ptr(p, new_mask); - } -@@ -6009,7 +6017,7 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask) - } - #endif - again: -- retval = __set_cpus_allowed_ptr(p, new_mask, true); -+ retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK); - - if (!retval) { - cpuset_cpus_allowed(p, cpus_allowed); -@@ -6588,7 
+6596,7 @@ void init_idle(struct task_struct *idle, int cpu) - * - * And since this is boot we can forgo the serialization. - */ -- set_cpus_allowed_common(idle, cpumask_of(cpu)); -+ set_cpus_allowed_common(idle, cpumask_of(cpu), 0); - #endif - /* - * We're having a chicken and egg problem, even though we are -diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c -index e0564abfece1..9495a26f7e73 100644 ---- a/kernel/sched/deadline.c -+++ b/kernel/sched/deadline.c -@@ -2307,7 +2307,8 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p) - } - - static void set_cpus_allowed_dl(struct task_struct *p, -- const struct cpumask *new_mask) -+ const struct cpumask *new_mask, -+ u32 flags) - { - struct root_domain *src_rd; - struct rq *rq; -@@ -2336,7 +2337,7 @@ static void set_cpus_allowed_dl(struct task_struct *p, - raw_spin_unlock(&src_dl_b->lock); - } - -- set_cpus_allowed_common(p, new_mask); -+ set_cpus_allowed_common(p, new_mask, flags); - } - - /* Assumes rq->lock is held */ -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index 7dcd5ac722ec..07991844697c 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h -@@ -1809,7 +1809,8 @@ struct sched_class { - void (*task_woken)(struct rq *this_rq, struct task_struct *task); - - void (*set_cpus_allowed)(struct task_struct *p, -- const struct cpumask *newmask); -+ const struct cpumask *newmask, -+ u32 flags); - - void (*rq_online)(struct rq *rq); - void (*rq_offline)(struct rq *rq); -@@ -1902,7 +1903,9 @@ extern void update_group_capacity(struct sched_domain *sd, int cpu); - - extern void trigger_load_balance(struct rq *rq); - --extern void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask); -+#define SCA_CHECK 0x01 -+ -+extern void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask, u32 flags); - - #endif - --- -2.30.2 - diff --git a/debian/patches-rt/0010-drm-i915-Drop-the-irqs_disabled-check.patch 
b/debian/patches-rt/0010-drm-i915-Drop-the-irqs_disabled-check.patch new file mode 100644 index 000000000..711f1de53 --- /dev/null +++ b/debian/patches-rt/0010-drm-i915-Drop-the-irqs_disabled-check.patch @@ -0,0 +1,39 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri, 1 Oct 2021 20:01:03 +0200 +Subject: [PATCH 10/10] drm/i915: Drop the irqs_disabled() check +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The !irqs_disabled() check triggers on PREEMPT_RT even with +i915_sched_engine::lock acquired. The reason is the lock is transformed +into a sleeping lock on PREEMPT_RT and does not disable interrupts. + +There is no need to check for disabled interrupts. The lockdep +annotation below already check if the lock has been acquired by the +caller and will yell if the interrupts are not disabled. + +Remove the !irqs_disabled() check. + +Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + drivers/gpu/drm/i915/i915_request.c | 2 -- + 1 file changed, 2 deletions(-) + +--- a/drivers/gpu/drm/i915/i915_request.c ++++ b/drivers/gpu/drm/i915/i915_request.c +@@ -559,7 +559,6 @@ bool __i915_request_submit(struct i915_r + + RQ_TRACE(request, "\n"); + +- GEM_BUG_ON(!irqs_disabled()); + lockdep_assert_held(&engine->sched_engine->lock); + + /* +@@ -668,7 +667,6 @@ void __i915_request_unsubmit(struct i915 + */ + RQ_TRACE(request, "\n"); + +- GEM_BUG_ON(!irqs_disabled()); + lockdep_assert_held(&engine->sched_engine->lock); + + /* diff --git a/debian/patches-rt/0010-lockdep-selftests-Adapt-ww-tests-for-PREEMPT_RT.patch b/debian/patches-rt/0010-lockdep-selftests-Adapt-ww-tests-for-PREEMPT_RT.patch new file mode 100644 index 000000000..7ad99f0ee --- /dev/null +++ b/debian/patches-rt/0010-lockdep-selftests-Adapt-ww-tests-for-PREEMPT_RT.patch @@ -0,0 +1,252 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 12 
Aug 2021 18:13:39 +0200 +Subject: [PATCH 10/10] lockdep/selftests: Adapt ww-tests for PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The ww-mutex selftest operates directly on ww_mutex::base and assumes +its type is struct mutex. This isn't true on PREEMPT_RT which turns the +mutex into a rtmutex. + +Add a ww_mutex_base_ abstraction which maps to the relevant mutex_ or +rt_mutex_ function. +Change the CONFIG_DEBUG_MUTEXES ifdef to DEBUG_WW_MUTEXES. The latter is +true for the MUTEX and RTMUTEX implementation of WW-MUTEX. The +assignment is required in order to pass the tests. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + lib/locking-selftest.c | 74 +++++++++++++++++++++++++++++-------------------- + 1 file changed, 44 insertions(+), 30 deletions(-) + +--- a/lib/locking-selftest.c ++++ b/lib/locking-selftest.c +@@ -1700,6 +1700,20 @@ static void ww_test_fail_acquire(void) + #endif + } + ++#ifdef CONFIG_PREEMPT_RT ++#define ww_mutex_base_lock(b) rt_mutex_lock(b) ++#define ww_mutex_base_lock_nest_lock(b, b2) rt_mutex_lock_nest_lock(b, b2) ++#define ww_mutex_base_lock_interruptible(b) rt_mutex_lock_interruptible(b) ++#define ww_mutex_base_lock_killable(b) rt_mutex_lock_killable(b) ++#define ww_mutex_base_unlock(b) rt_mutex_unlock(b) ++#else ++#define ww_mutex_base_lock(b) mutex_lock(b) ++#define ww_mutex_base_lock_nest_lock(b, b2) mutex_lock_nest_lock(b, b2) ++#define ww_mutex_base_lock_interruptible(b) mutex_lock_interruptible(b) ++#define ww_mutex_base_lock_killable(b) mutex_lock_killable(b) ++#define ww_mutex_base_unlock(b) mutex_unlock(b) ++#endif ++ + static void ww_test_normal(void) + { + int ret; +@@ -1714,50 +1728,50 @@ static void ww_test_normal(void) + + /* mutex_lock (and indirectly, mutex_lock_nested) */ + o.ctx = (void *)~0UL; +- mutex_lock(&o.base); +- mutex_unlock(&o.base); ++ ww_mutex_base_lock(&o.base); ++ ww_mutex_base_unlock(&o.base); + WARN_ON(o.ctx != 
(void *)~0UL); + + /* mutex_lock_interruptible (and *_nested) */ + o.ctx = (void *)~0UL; +- ret = mutex_lock_interruptible(&o.base); ++ ret = ww_mutex_base_lock_interruptible(&o.base); + if (!ret) +- mutex_unlock(&o.base); ++ ww_mutex_base_unlock(&o.base); + else + WARN_ON(1); + WARN_ON(o.ctx != (void *)~0UL); + + /* mutex_lock_killable (and *_nested) */ + o.ctx = (void *)~0UL; +- ret = mutex_lock_killable(&o.base); ++ ret = ww_mutex_base_lock_killable(&o.base); + if (!ret) +- mutex_unlock(&o.base); ++ ww_mutex_base_unlock(&o.base); + else + WARN_ON(1); + WARN_ON(o.ctx != (void *)~0UL); + + /* trylock, succeeding */ + o.ctx = (void *)~0UL; +- ret = mutex_trylock(&o.base); ++ ret = ww_mutex_base_trylock(&o.base); + WARN_ON(!ret); + if (ret) +- mutex_unlock(&o.base); ++ ww_mutex_base_unlock(&o.base); + else + WARN_ON(1); + WARN_ON(o.ctx != (void *)~0UL); + + /* trylock, failing */ + o.ctx = (void *)~0UL; +- mutex_lock(&o.base); +- ret = mutex_trylock(&o.base); ++ ww_mutex_base_lock(&o.base); ++ ret = ww_mutex_base_trylock(&o.base); + WARN_ON(ret); +- mutex_unlock(&o.base); ++ ww_mutex_base_unlock(&o.base); + WARN_ON(o.ctx != (void *)~0UL); + + /* nest_lock */ + o.ctx = (void *)~0UL; +- mutex_lock_nest_lock(&o.base, &t); +- mutex_unlock(&o.base); ++ ww_mutex_base_lock_nest_lock(&o.base, &t); ++ ww_mutex_base_unlock(&o.base); + WARN_ON(o.ctx != (void *)~0UL); + } + +@@ -1770,7 +1784,7 @@ static void ww_test_two_contexts(void) + static void ww_test_diff_class(void) + { + WWAI(&t); +-#ifdef CONFIG_DEBUG_MUTEXES ++#ifdef DEBUG_WW_MUTEXES + t.ww_class = NULL; + #endif + WWL(&o, &t); +@@ -1834,7 +1848,7 @@ static void ww_test_edeadlk_normal(void) + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + o2.ctx = &t2; + mutex_release(&o2.base.dep_map, _THIS_IP_); + +@@ -1850,7 +1864,7 @@ static void ww_test_edeadlk_normal(void) + + o2.ctx = NULL; + mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_); +- mutex_unlock(&o2.base); ++ 
ww_mutex_base_unlock(&o2.base); + WWU(&o); + + WWL(&o2, &t); +@@ -1860,7 +1874,7 @@ static void ww_test_edeadlk_normal_slow( + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + +@@ -1876,7 +1890,7 @@ static void ww_test_edeadlk_normal_slow( + + o2.ctx = NULL; + mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_); +- mutex_unlock(&o2.base); ++ ww_mutex_base_unlock(&o2.base); + WWU(&o); + + ww_mutex_lock_slow(&o2, &t); +@@ -1886,7 +1900,7 @@ static void ww_test_edeadlk_no_unlock(vo + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + o2.ctx = &t2; + mutex_release(&o2.base.dep_map, _THIS_IP_); + +@@ -1902,7 +1916,7 @@ static void ww_test_edeadlk_no_unlock(vo + + o2.ctx = NULL; + mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_); +- mutex_unlock(&o2.base); ++ ww_mutex_base_unlock(&o2.base); + + WWL(&o2, &t); + } +@@ -1911,7 +1925,7 @@ static void ww_test_edeadlk_no_unlock_sl + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + +@@ -1927,7 +1941,7 @@ static void ww_test_edeadlk_no_unlock_sl + + o2.ctx = NULL; + mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_); +- mutex_unlock(&o2.base); ++ ww_mutex_base_unlock(&o2.base); + + ww_mutex_lock_slow(&o2, &t); + } +@@ -1936,7 +1950,7 @@ static void ww_test_edeadlk_acquire_more + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + +@@ -1957,7 +1971,7 @@ static void ww_test_edeadlk_acquire_more + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + +@@ -1978,11 +1992,11 @@ static void ww_test_edeadlk_acquire_more + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + +- mutex_lock(&o3.base); 
++ ww_mutex_base_lock(&o3.base); + mutex_release(&o3.base.dep_map, _THIS_IP_); + o3.ctx = &t2; + +@@ -2004,11 +2018,11 @@ static void ww_test_edeadlk_acquire_more + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + +- mutex_lock(&o3.base); ++ ww_mutex_base_lock(&o3.base); + mutex_release(&o3.base.dep_map, _THIS_IP_); + o3.ctx = &t2; + +@@ -2029,7 +2043,7 @@ static void ww_test_edeadlk_acquire_wron + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + +@@ -2054,7 +2068,7 @@ static void ww_test_edeadlk_acquire_wron + { + int ret; + +- mutex_lock(&o2.base); ++ ww_mutex_base_lock(&o2.base); + mutex_release(&o2.base.dep_map, _THIS_IP_); + o2.ctx = &t2; + diff --git a/debian/patches-rt/0010-sched-Add-migrate_disable.patch b/debian/patches-rt/0010-sched-Add-migrate_disable.patch deleted file mode 100644 index 49fc2f97b..000000000 --- a/debian/patches-rt/0010-sched-Add-migrate_disable.patch +++ /dev/null @@ -1,356 +0,0 @@ -From cab021aed0c1f72899edd54800322e5190e98a8c Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:07 +0200 -Subject: [PATCH 010/296] sched: Add migrate_disable() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Add the base migrate_disable() support (under protest). - -While migrate_disable() is (currently) required for PREEMPT_RT, it is -also one of the biggest flaws in the system. - -Notably this is just the base implementation, it is broken vs -sched_setaffinity() and hotplug, both solved in additional patches for -ease of review. 
- -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/preempt.h | 65 +++++++++++++++++++++++ - include/linux/sched.h | 3 ++ - kernel/sched/core.c | 112 +++++++++++++++++++++++++++++++++++++--- - kernel/sched/sched.h | 6 ++- - lib/smp_processor_id.c | 5 ++ - 5 files changed, 183 insertions(+), 8 deletions(-) - -diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index 7d9c1c0e149c..97ba7c920653 100644 ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -322,6 +322,69 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier, - - #endif - -+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+ -+/* -+ * Migrate-Disable and why it is (strongly) undesired. -+ * -+ * The premise of the Real-Time schedulers we have on Linux -+ * (SCHED_FIFO/SCHED_DEADLINE) is that M CPUs can/will run M tasks -+ * concurrently, provided there are sufficient runnable tasks, also known as -+ * work-conserving. For instance SCHED_DEADLINE tries to schedule the M -+ * earliest deadline threads, and SCHED_FIFO the M highest priority threads. -+ * -+ * The correctness of various scheduling models depends on this, but is it -+ * broken by migrate_disable() that doesn't imply preempt_disable(). Where -+ * preempt_disable() implies an immediate priority ceiling, preemptible -+ * migrate_disable() allows nesting. -+ * -+ * The worst case is that all tasks preempt one another in a migrate_disable() -+ * region and stack on a single CPU. This then reduces the available bandwidth -+ * to a single CPU. And since Real-Time schedulability theory considers the -+ * Worst-Case only, all Real-Time analysis shall revert to single-CPU -+ * (instantly solving the SMP analysis problem). -+ * -+ * -+ * The reason we have it anyway. -+ * -+ * PREEMPT_RT breaks a number of assumptions traditionally held. 
By forcing a -+ * number of primitives into becoming preemptible, they would also allow -+ * migration. This turns out to break a bunch of per-cpu usage. To this end, -+ * all these primitives employ migrate_disable() to restore this implicit -+ * assumption. -+ * -+ * This is a 'temporary' work-around at best. The correct solution is getting -+ * rid of the above assumptions and reworking the code to employ explicit -+ * per-cpu locking or short preempt-disable regions. -+ * -+ * The end goal must be to get rid of migrate_disable(), alternatively we need -+ * a schedulability theory that does not depend on arbitrary migration. -+ * -+ * -+ * Notes on the implementation. -+ * -+ * The implementation is particularly tricky since existing code patterns -+ * dictate neither migrate_disable() nor migrate_enable() is allowed to block. -+ * This means that it cannot use cpus_read_lock() to serialize against hotplug, -+ * nor can it easily migrate itself into a pending affinity mask change on -+ * migrate_enable(). -+ * -+ * -+ * Note: even non-work-conserving schedulers like semi-partitioned depend on -+ * migration, so migrate_disable() is not only a problem for -+ * work-conserving schedulers.
-+ * -+ */ -+extern void migrate_disable(void); -+extern void migrate_enable(void); -+ -+#elif defined(CONFIG_PREEMPT_RT) -+ -+static inline void migrate_disable(void) { } -+static inline void migrate_enable(void) { } -+ -+#else /* !CONFIG_PREEMPT_RT */ -+ - /** - * migrate_disable - Prevent migration of the current task - * -@@ -352,4 +415,6 @@ static __always_inline void migrate_enable(void) - preempt_enable(); - } - -+#endif /* CONFIG_SMP && CONFIG_PREEMPT_RT */ -+ - #endif /* __LINUX_PREEMPT_H */ -diff --git a/include/linux/sched.h b/include/linux/sched.h -index 76cd21fa5501..8efed5834170 100644 ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -722,6 +722,9 @@ struct task_struct { - int nr_cpus_allowed; - const cpumask_t *cpus_ptr; - cpumask_t cpus_mask; -+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+ int migration_disabled; -+#endif - - #ifdef CONFIG_PREEMPT_RCU - int rcu_read_lock_nesting; -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 1323b3f9d40e..583b1312f5c6 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1694,6 +1694,61 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags) - - #ifdef CONFIG_SMP - -+#ifdef CONFIG_PREEMPT_RT -+ -+static void -+__do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask, u32 flags); -+ -+static int __set_cpus_allowed_ptr(struct task_struct *p, -+ const struct cpumask *new_mask, -+ u32 flags); -+ -+static void migrate_disable_switch(struct rq *rq, struct task_struct *p) -+{ -+ if (likely(!p->migration_disabled)) -+ return; -+ -+ if (p->cpus_ptr != &p->cpus_mask) -+ return; -+ -+ /* -+ * Violates locking rules! see comment in __do_set_cpus_allowed(). 
-+ */ -+ __do_set_cpus_allowed(p, cpumask_of(rq->cpu), SCA_MIGRATE_DISABLE); -+} -+ -+void migrate_disable(void) -+{ -+ if (current->migration_disabled++) -+ return; -+ -+ barrier(); -+} -+EXPORT_SYMBOL_GPL(migrate_disable); -+ -+void migrate_enable(void) -+{ -+ struct task_struct *p = current; -+ -+ if (--p->migration_disabled) -+ return; -+ -+ barrier(); -+ -+ if (p->cpus_ptr == &p->cpus_mask) -+ return; -+ -+ __set_cpus_allowed_ptr(p, &p->cpus_mask, SCA_MIGRATE_ENABLE); -+} -+EXPORT_SYMBOL_GPL(migrate_enable); -+ -+static inline bool is_migration_disabled(struct task_struct *p) -+{ -+ return p->migration_disabled; -+} -+ -+#endif -+ - /* - * Per-CPU kthreads are allowed to run on !active && online CPUs, see - * __set_cpus_allowed_ptr() and select_fallback_rq(). -@@ -1703,7 +1758,7 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu) - if (!cpumask_test_cpu(cpu, p->cpus_ptr)) - return false; - -- if (is_per_cpu_kthread(p)) -+ if (is_per_cpu_kthread(p) || is_migration_disabled(p)) - return cpu_online(cpu); - - return cpu_active(cpu); -@@ -1824,6 +1879,11 @@ static int migration_cpu_stop(void *data) - */ - void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask, u32 flags) - { -+ if (flags & (SCA_MIGRATE_ENABLE | SCA_MIGRATE_DISABLE)) { -+ p->cpus_ptr = new_mask; -+ return; -+ } -+ - cpumask_copy(&p->cpus_mask, new_mask); - p->nr_cpus_allowed = cpumask_weight(new_mask); - } -@@ -1834,7 +1894,22 @@ __do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask, u32 - struct rq *rq = task_rq(p); - bool queued, running; - -- lockdep_assert_held(&p->pi_lock); -+ /* -+ * This here violates the locking rules for affinity, since we're only -+ * supposed to change these variables while holding both rq->lock and -+ * p->pi_lock. 
-+ * -+ * HOWEVER, it magically works, because ttwu() is the only code that -+ * accesses these variables under p->pi_lock and only does so after -+ * smp_cond_load_acquire(&p->on_cpu, !VAL), and we're in __schedule() -+ * before finish_task(). -+ * -+ * XXX do further audits, this smells like something putrid. -+ */ -+ if (flags & SCA_MIGRATE_DISABLE) -+ SCHED_WARN_ON(!p->on_cpu); -+ else -+ lockdep_assert_held(&p->pi_lock); - - queued = task_on_rq_queued(p); - running = task_current(rq, p); -@@ -1885,9 +1960,14 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, - rq = task_rq_lock(p, &rf); - update_rq_clock(rq); - -- if (p->flags & PF_KTHREAD) { -+ if (p->flags & PF_KTHREAD || is_migration_disabled(p)) { - /* -- * Kernel threads are allowed on online && !active CPUs -+ * Kernel threads are allowed on online && !active CPUs. -+ * -+ * Specifically, migration_disabled() tasks must not fail the -+ * cpumask_any_and_distribute() pick below, esp. so on -+ * SCA_MIGRATE_ENABLE, otherwise we'll not call -+ * set_cpus_allowed_common() and actually reset p->cpus_ptr. - */ - cpu_valid_mask = cpu_online_mask; - } -@@ -1901,7 +1981,7 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, - goto out; - } - -- if (cpumask_equal(&p->cpus_mask, new_mask)) -+ if (!(flags & SCA_MIGRATE_ENABLE) && cpumask_equal(&p->cpus_mask, new_mask)) - goto out; - - /* -@@ -1993,6 +2073,8 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu) - * Clearly, migrating tasks to offline CPUs is a fairly daft thing. - */ - WARN_ON_ONCE(!cpu_online(new_cpu)); -+ -+ WARN_ON_ONCE(is_migration_disabled(p)); - #endif - - trace_sched_migrate_task(p, new_cpu); -@@ -2323,6 +2405,12 @@ static int select_fallback_rq(int cpu, struct task_struct *p) - } - fallthrough; - case possible: -+ /* -+ * XXX When called from select_task_rq() we only -+ * hold p->pi_lock and again violate locking order. -+ * -+ * More yuck to audit. 
-+ */ - do_set_cpus_allowed(p, cpu_possible_mask); - state = fail; - break; -@@ -2357,7 +2445,7 @@ int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags) - { - lockdep_assert_held(&p->pi_lock); - -- if (p->nr_cpus_allowed > 1) -+ if (p->nr_cpus_allowed > 1 && !is_migration_disabled(p)) - cpu = p->sched_class->select_task_rq(p, cpu, sd_flags, wake_flags); - else - cpu = cpumask_any(p->cpus_ptr); -@@ -2419,6 +2507,17 @@ static inline int __set_cpus_allowed_ptr(struct task_struct *p, - - #endif /* CONFIG_SMP */ - -+#if !defined(CONFIG_SMP) || !defined(CONFIG_PREEMPT_RT) -+ -+static inline void migrate_disable_switch(struct rq *rq, struct task_struct *p) { } -+ -+static inline bool is_migration_disabled(struct task_struct *p) -+{ -+ return false; -+} -+ -+#endif -+ - static void - ttwu_stat(struct task_struct *p, int cpu, int wake_flags) - { -@@ -4572,6 +4671,7 @@ static void __sched notrace __schedule(bool preempt) - */ - ++*switch_count; - -+ migrate_disable_switch(rq, prev); - psi_sched_switch(prev, next, !task_on_rq_queued(prev)); - - trace_sched_switch(preempt, prev, next); -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index 07991844697c..d2fc1db9d3c7 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h -@@ -1897,14 +1897,16 @@ static inline bool sched_fair_runnable(struct rq *rq) - extern struct task_struct *pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf); - extern struct task_struct *pick_next_task_idle(struct rq *rq); - -+#define SCA_CHECK 0x01 -+#define SCA_MIGRATE_DISABLE 0x02 -+#define SCA_MIGRATE_ENABLE 0x04 -+ - #ifdef CONFIG_SMP - - extern void update_group_capacity(struct sched_domain *sd, int cpu); - - extern void trigger_load_balance(struct rq *rq); - --#define SCA_CHECK 0x01 -- - extern void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask, u32 flags); - - #endif -diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c -index 
525222e4f409..faaa927ac2c8 100644 ---- a/lib/smp_processor_id.c -+++ b/lib/smp_processor_id.c -@@ -26,6 +26,11 @@ unsigned int check_preemption_disabled(const char *what1, const char *what2) - if (current->nr_cpus_allowed == 1) - goto out; - -+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+ if (current->migration_disabled) -+ goto out; -+#endif -+ - /* - * It is valid to assume CPU-locality during early bootup: - */ --- -2.30.2 - diff --git a/debian/patches-rt/0011-sched-Fix-migrate_disable-vs-set_cpus_allowed_ptr.patch b/debian/patches-rt/0011-sched-Fix-migrate_disable-vs-set_cpus_allowed_ptr.patch deleted file mode 100644 index c77e704b1..000000000 --- a/debian/patches-rt/0011-sched-Fix-migrate_disable-vs-set_cpus_allowed_ptr.patch +++ /dev/null @@ -1,370 +0,0 @@ -From 8f1eeef6b8cdb57c1364c40b12afb4464d3b78d0 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:08 +0200 -Subject: [PATCH 011/296] sched: Fix migrate_disable() vs - set_cpus_allowed_ptr() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Concurrent migrate_disable() and set_cpus_allowed_ptr() has -interesting features. We rely on set_cpus_allowed_ptr() to not return -until the task runs inside the provided mask. This expectation is -exported to userspace. - -This means that any set_cpus_allowed_ptr() caller must wait until -migrate_enable() allows migrations. - -At the same time, we don't want migrate_enable() to schedule, due to -patterns like: - - preempt_disable(); - migrate_disable(); - ... - migrate_enable(); - preempt_enable(); - -And: - - raw_spin_lock(&B); - spin_unlock(&A); - -this means that when migrate_enable() must restore the affinity -mask, it cannot wait for completion thereof. Luck will have it that -that is exactly the case where there is a pending -set_cpus_allowed_ptr(), so let that provide storage for the async stop -machine. 
- -Much thanks to Valentin who used TLA+ most effective and found lots of -'interesting' cases. - -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/sched.h | 1 + - kernel/sched/core.c | 234 ++++++++++++++++++++++++++++++++++++------ - 2 files changed, 205 insertions(+), 30 deletions(-) - -diff --git a/include/linux/sched.h b/include/linux/sched.h -index 8efed5834170..71accd93e1af 100644 ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -722,6 +722,7 @@ struct task_struct { - int nr_cpus_allowed; - const cpumask_t *cpus_ptr; - cpumask_t cpus_mask; -+ void *migration_pending; - #if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) - int migration_disabled; - #endif -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 583b1312f5c6..9dac9131d313 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1730,15 +1730,26 @@ void migrate_enable(void) - { - struct task_struct *p = current; - -- if (--p->migration_disabled) -+ if (p->migration_disabled > 1) { -+ p->migration_disabled--; - return; -+ } - -+ /* -+ * Ensure stop_task runs either before or after this, and that -+ * __set_cpus_allowed_ptr(SCA_MIGRATE_ENABLE) doesn't schedule(). -+ */ -+ preempt_disable(); -+ if (p->cpus_ptr != &p->cpus_mask) -+ __set_cpus_allowed_ptr(p, &p->cpus_mask, SCA_MIGRATE_ENABLE); -+ /* -+ * Mustn't clear migration_disabled() until cpus_ptr points back at the -+ * regular cpus_mask, otherwise things that race (eg. -+ * select_fallback_rq) get confused. 
-+ */ - barrier(); -- -- if (p->cpus_ptr == &p->cpus_mask) -- return; -- -- __set_cpus_allowed_ptr(p, &p->cpus_mask, SCA_MIGRATE_ENABLE); -+ p->migration_disabled = 0; -+ preempt_enable(); - } - EXPORT_SYMBOL_GPL(migrate_enable); - -@@ -1803,8 +1814,16 @@ static struct rq *move_queued_task(struct rq *rq, struct rq_flags *rf, - } - - struct migration_arg { -- struct task_struct *task; -- int dest_cpu; -+ struct task_struct *task; -+ int dest_cpu; -+ struct set_affinity_pending *pending; -+}; -+ -+struct set_affinity_pending { -+ refcount_t refs; -+ struct completion done; -+ struct cpu_stop_work stop_work; -+ struct migration_arg arg; - }; - - /* -@@ -1836,16 +1855,19 @@ static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf, - */ - static int migration_cpu_stop(void *data) - { -+ struct set_affinity_pending *pending; - struct migration_arg *arg = data; - struct task_struct *p = arg->task; -+ int dest_cpu = arg->dest_cpu; - struct rq *rq = this_rq(); -+ bool complete = false; - struct rq_flags rf; - - /* - * The original target CPU might have gone down and we might - * be on another CPU but it doesn't matter. - */ -- local_irq_disable(); -+ local_irq_save(rf.flags); - /* - * We need to explicitly wake pending tasks before running - * __migrate_task() such that we will not miss enforcing cpus_ptr -@@ -1855,21 +1877,83 @@ static int migration_cpu_stop(void *data) - - raw_spin_lock(&p->pi_lock); - rq_lock(rq, &rf); -+ -+ pending = p->migration_pending; - /* - * If task_rq(p) != rq, it cannot be migrated here, because we're - * holding rq->lock, if p->on_rq == 0 it cannot get enqueued because - * we're holding p->pi_lock. 
- */ - if (task_rq(p) == rq) { -+ if (is_migration_disabled(p)) -+ goto out; -+ -+ if (pending) { -+ p->migration_pending = NULL; -+ complete = true; -+ } -+ -+ /* migrate_enable() -- we must not race against SCA */ -+ if (dest_cpu < 0) { -+ /* -+ * When this was migrate_enable() but we no longer -+ * have a @pending, a concurrent SCA 'fixed' things -+ * and we should be valid again. Nothing to do. -+ */ -+ if (!pending) { -+ WARN_ON_ONCE(!is_cpu_allowed(p, cpu_of(rq))); -+ goto out; -+ } -+ -+ dest_cpu = cpumask_any_distribute(&p->cpus_mask); -+ } -+ - if (task_on_rq_queued(p)) -- rq = __migrate_task(rq, &rf, p, arg->dest_cpu); -+ rq = __migrate_task(rq, &rf, p, dest_cpu); - else -- p->wake_cpu = arg->dest_cpu; -+ p->wake_cpu = dest_cpu; -+ -+ } else if (dest_cpu < 0) { -+ /* -+ * This happens when we get migrated between migrate_enable()'s -+ * preempt_enable() and scheduling the stopper task. At that -+ * point we're a regular task again and not current anymore. -+ * -+ * A !PREEMPT kernel has a giant hole here, which makes it far -+ * more likely. -+ */ -+ -+ /* -+ * When this was migrate_enable() but we no longer have an -+ * @pending, a concurrent SCA 'fixed' things and we should be -+ * valid again. Nothing to do. -+ */ -+ if (!pending) { -+ WARN_ON_ONCE(!is_cpu_allowed(p, cpu_of(rq))); -+ goto out; -+ } -+ -+ /* -+ * When migrate_enable() hits a rq mis-match we can't reliably -+ * determine is_migration_disabled() and so have to chase after -+ * it. 
-+ */ -+ task_rq_unlock(rq, p, &rf); -+ stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop, -+ &pending->arg, &pending->stop_work); -+ return 0; - } -- rq_unlock(rq, &rf); -- raw_spin_unlock(&p->pi_lock); -+out: -+ task_rq_unlock(rq, p, &rf); -+ -+ if (complete) -+ complete_all(&pending->done); -+ -+ /* For pending->{arg,stop_work} */ -+ pending = arg->pending; -+ if (pending && refcount_dec_and_test(&pending->refs)) -+ wake_up_var(&pending->refs); - -- local_irq_enable(); - return 0; - } - -@@ -1938,6 +2022,110 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) - __do_set_cpus_allowed(p, new_mask, 0); - } - -+/* -+ * This function is wildly self concurrent, consider at least 3 times. -+ */ -+static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flags *rf, -+ int dest_cpu, unsigned int flags) -+{ -+ struct set_affinity_pending my_pending = { }, *pending = NULL; -+ struct migration_arg arg = { -+ .task = p, -+ .dest_cpu = dest_cpu, -+ }; -+ bool complete = false; -+ -+ /* Can the task run on the task's current CPU? If so, we're done */ -+ if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) { -+ pending = p->migration_pending; -+ if (pending) { -+ refcount_inc(&pending->refs); -+ p->migration_pending = NULL; -+ complete = true; -+ } -+ task_rq_unlock(rq, p, rf); -+ -+ if (complete) -+ goto do_complete; -+ -+ return 0; -+ } -+ -+ if (!(flags & SCA_MIGRATE_ENABLE)) { -+ /* serialized by p->pi_lock */ -+ if (!p->migration_pending) { -+ refcount_set(&my_pending.refs, 1); -+ init_completion(&my_pending.done); -+ p->migration_pending = &my_pending; -+ } else { -+ pending = p->migration_pending; -+ refcount_inc(&pending->refs); -+ } -+ } -+ pending = p->migration_pending; -+ /* -+ * - !MIGRATE_ENABLE: -+ * we'll have installed a pending if there wasn't one already. 
-+ * -+ * - MIGRATE_ENABLE: -+ * we're here because the current CPU isn't matching anymore, -+ * the only way that can happen is because of a concurrent -+ * set_cpus_allowed_ptr() call, which should then still be -+ * pending completion. -+ * -+ * Either way, we really should have a @pending here. -+ */ -+ if (WARN_ON_ONCE(!pending)) -+ return -EINVAL; -+ -+ if (flags & SCA_MIGRATE_ENABLE) { -+ -+ refcount_inc(&pending->refs); /* pending->{arg,stop_work} */ -+ task_rq_unlock(rq, p, rf); -+ -+ pending->arg = (struct migration_arg) { -+ .task = p, -+ .dest_cpu = -1, -+ .pending = pending, -+ }; -+ -+ stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, -+ &pending->arg, &pending->stop_work); -+ -+ return 0; -+ } -+ -+ if (task_running(rq, p) || p->state == TASK_WAKING) { -+ -+ task_rq_unlock(rq, p, rf); -+ stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg); -+ -+ } else { -+ -+ if (!is_migration_disabled(p)) { -+ if (task_on_rq_queued(p)) -+ rq = move_queued_task(rq, rf, p, dest_cpu); -+ -+ p->migration_pending = NULL; -+ complete = true; -+ } -+ task_rq_unlock(rq, p, rf); -+ -+do_complete: -+ if (complete) -+ complete_all(&pending->done); -+ } -+ -+ wait_for_completion(&pending->done); -+ -+ if (refcount_dec_and_test(&pending->refs)) -+ wake_up_var(&pending->refs); -+ -+ wait_var_event(&my_pending.refs, !refcount_read(&my_pending.refs)); -+ -+ return 0; -+} -+ - /* - * Change a given task's CPU affinity. Migrate the thread to a - * proper CPU and schedule it away if the CPU it's executing on -@@ -2007,23 +2195,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, - p->nr_cpus_allowed != 1); - } - -- /* Can the task run on the task's current CPU? If so, we're done */ -- if (cpumask_test_cpu(task_cpu(p), new_mask)) -- goto out; -+ return affine_move_task(rq, p, &rf, dest_cpu, flags); - -- if (task_running(rq, p) || p->state == TASK_WAKING) { -- struct migration_arg arg = { p, dest_cpu }; -- /* Need help from migration thread: drop lock and wait. 
*/ -- task_rq_unlock(rq, p, &rf); -- stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg); -- return 0; -- } else if (task_on_rq_queued(p)) { -- /* -- * OK, since we're going to drop the lock immediately -- * afterwards anyway. -- */ -- rq = move_queued_task(rq, &rf, p, dest_cpu); -- } - out: - task_rq_unlock(rq, p, &rf); - -@@ -3207,6 +3380,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p) - init_numa_balancing(clone_flags, p); - #ifdef CONFIG_SMP - p->wake_entry.u_flags = CSD_TYPE_TTWU; -+ p->migration_pending = NULL; - #endif - } - --- -2.30.2 - diff --git a/debian/patches-rt/0012-sched-core-Make-migrate-disable-and-CPU-hotplug-coop.patch b/debian/patches-rt/0012-sched-core-Make-migrate-disable-and-CPU-hotplug-coop.patch deleted file mode 100644 index 56f27e7e2..000000000 --- a/debian/patches-rt/0012-sched-core-Make-migrate-disable-and-CPU-hotplug-coop.patch +++ /dev/null @@ -1,137 +0,0 @@ -From 9cd3e6a8b48eaad5ed8393c3cccf4e9645c95fc8 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 23 Oct 2020 12:12:09 +0200 -Subject: [PATCH 012/296] sched/core: Make migrate disable and CPU hotplug - cooperative -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -On CPU unplug tasks which are in a migrate disabled region cannot be pushed -to a different CPU until they returned to migrateable state. - -Account the number of tasks on a runqueue which are in a migrate disabled -section and make the hotplug wait mechanism respect that. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 36 ++++++++++++++++++++++++++++++------ - kernel/sched/sched.h | 4 ++++ - 2 files changed, 34 insertions(+), 6 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 9dac9131d313..91f2985ee447 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1719,10 +1719,17 @@ static void migrate_disable_switch(struct rq *rq, struct task_struct *p) - - void migrate_disable(void) - { -- if (current->migration_disabled++) -+ struct task_struct *p = current; -+ -+ if (p->migration_disabled) { -+ p->migration_disabled++; - return; -+ } - -- barrier(); -+ preempt_disable(); -+ this_rq()->nr_pinned++; -+ p->migration_disabled = 1; -+ preempt_enable(); - } - EXPORT_SYMBOL_GPL(migrate_disable); - -@@ -1749,6 +1756,7 @@ void migrate_enable(void) - */ - barrier(); - p->migration_disabled = 0; -+ this_rq()->nr_pinned--; - preempt_enable(); - } - EXPORT_SYMBOL_GPL(migrate_enable); -@@ -1758,6 +1766,11 @@ static inline bool is_migration_disabled(struct task_struct *p) - return p->migration_disabled; - } - -+static inline bool rq_has_pinned_tasks(struct rq *rq) -+{ -+ return rq->nr_pinned; -+} -+ - #endif - - /* -@@ -2689,6 +2702,11 @@ static inline bool is_migration_disabled(struct task_struct *p) - return false; - } - -+static inline bool rq_has_pinned_tasks(struct rq *rq) -+{ -+ return false; -+} -+ - #endif - - static void -@@ -7062,15 +7080,20 @@ static void balance_push(struct rq *rq) - * Both the cpu-hotplug and stop task are in this case and are - * required to complete the hotplug process. 
- */ -- if (is_per_cpu_kthread(push_task)) { -+ if (is_per_cpu_kthread(push_task) || is_migration_disabled(push_task)) { - /* - * If this is the idle task on the outgoing CPU try to wake - * up the hotplug control thread which might wait for the - * last task to vanish. The rcuwait_active() check is - * accurate here because the waiter is pinned on this CPU - * and can't obviously be running in parallel. -+ * -+ * On RT kernels this also has to check whether there are -+ * pinned and scheduled out tasks on the runqueue. They -+ * need to leave the migrate disabled section first. - */ -- if (!rq->nr_running && rcuwait_active(&rq->hotplug_wait)) { -+ if (!rq->nr_running && !rq_has_pinned_tasks(rq) && -+ rcuwait_active(&rq->hotplug_wait)) { - raw_spin_unlock(&rq->lock); - rcuwait_wake_up(&rq->hotplug_wait); - raw_spin_lock(&rq->lock); -@@ -7117,7 +7140,8 @@ static void balance_hotplug_wait(void) - { - struct rq *rq = this_rq(); - -- rcuwait_wait_event(&rq->hotplug_wait, rq->nr_running == 1, -+ rcuwait_wait_event(&rq->hotplug_wait, -+ rq->nr_running == 1 && !rq_has_pinned_tasks(rq), - TASK_UNINTERRUPTIBLE); - } - -@@ -7362,7 +7386,7 @@ int sched_cpu_dying(unsigned int cpu) - sched_tick_stop(cpu); - - rq_lock_irqsave(rq, &rf); -- BUG_ON(rq->nr_running != 1); -+ BUG_ON(rq->nr_running != 1 || rq_has_pinned_tasks(rq)); - rq_unlock_irqrestore(rq, &rf); - - calc_load_migrate(rq); -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index d2fc1db9d3c7..3be3a1e9179f 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h -@@ -1048,6 +1048,10 @@ struct rq { - /* Must be inspected within a rcu lock section */ - struct cpuidle_state *idle_state; - #endif -+ -+#if defined(CONFIG_PREEMPT_RT) && defined(CONFIG_SMP) -+ unsigned int nr_pinned; -+#endif - }; - - #ifdef CONFIG_FAIR_GROUP_SCHED --- -2.30.2 - diff --git a/debian/patches-rt/0013-sched-rt-Use-cpumask_any-_distribute.patch b/debian/patches-rt/0013-sched-rt-Use-cpumask_any-_distribute.patch deleted file mode 
100644 index 88abeaadc..000000000 --- a/debian/patches-rt/0013-sched-rt-Use-cpumask_any-_distribute.patch +++ /dev/null @@ -1,121 +0,0 @@ -From 659afe991b1acffeacff51f2cbbf979fc06e9465 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:10 +0200 -Subject: [PATCH 013/296] sched,rt: Use cpumask_any*_distribute() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Replace a bunch of cpumask_any*() instances with -cpumask_any*_distribute(), by injecting this little bit of random in -cpu selection, we reduce the chance two competing balance operations -working off the same lowest_mask pick the same CPU. - -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/cpumask.h | 6 ++++++ - kernel/sched/deadline.c | 6 +++--- - kernel/sched/rt.c | 6 +++--- - lib/cpumask.c | 18 ++++++++++++++++++ - 4 files changed, 30 insertions(+), 6 deletions(-) - -diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h -index f0d895d6ac39..383684e30f12 100644 ---- a/include/linux/cpumask.h -+++ b/include/linux/cpumask.h -@@ -199,6 +199,11 @@ static inline int cpumask_any_and_distribute(const struct cpumask *src1p, - return cpumask_next_and(-1, src1p, src2p); - } - -+static inline int cpumask_any_distribute(const struct cpumask *srcp) -+{ -+ return cpumask_first(srcp); -+} -+ - #define for_each_cpu(cpu, mask) \ - for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask) - #define for_each_cpu_not(cpu, mask) \ -@@ -252,6 +257,7 @@ int cpumask_any_but(const struct cpumask *mask, unsigned int cpu); - unsigned int cpumask_local_spread(unsigned int i, int node); - int cpumask_any_and_distribute(const struct cpumask *src1p, - const struct cpumask *src2p); -+int cpumask_any_distribute(const struct cpumask *srcp); - - /** - * for_each_cpu - iterate over every cpu in a mask -diff --git a/kernel/sched/deadline.c 
b/kernel/sched/deadline.c -index 9495a26f7e73..3c6b3d13e1b9 100644 ---- a/kernel/sched/deadline.c -+++ b/kernel/sched/deadline.c -@@ -2008,8 +2008,8 @@ static int find_later_rq(struct task_struct *task) - return this_cpu; - } - -- best_cpu = cpumask_first_and(later_mask, -- sched_domain_span(sd)); -+ best_cpu = cpumask_any_and_distribute(later_mask, -+ sched_domain_span(sd)); - /* - * Last chance: if a CPU being in both later_mask - * and current sd span is valid, that becomes our -@@ -2031,7 +2031,7 @@ static int find_later_rq(struct task_struct *task) - if (this_cpu != -1) - return this_cpu; - -- cpu = cpumask_any(later_mask); -+ cpu = cpumask_any_distribute(later_mask); - if (cpu < nr_cpu_ids) - return cpu; - -diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c -index 40a46639f78a..2525a1beed26 100644 ---- a/kernel/sched/rt.c -+++ b/kernel/sched/rt.c -@@ -1752,8 +1752,8 @@ static int find_lowest_rq(struct task_struct *task) - return this_cpu; - } - -- best_cpu = cpumask_first_and(lowest_mask, -- sched_domain_span(sd)); -+ best_cpu = cpumask_any_and_distribute(lowest_mask, -+ sched_domain_span(sd)); - if (best_cpu < nr_cpu_ids) { - rcu_read_unlock(); - return best_cpu; -@@ -1770,7 +1770,7 @@ static int find_lowest_rq(struct task_struct *task) - if (this_cpu != -1) - return this_cpu; - -- cpu = cpumask_any(lowest_mask); -+ cpu = cpumask_any_distribute(lowest_mask); - if (cpu < nr_cpu_ids) - return cpu; - -diff --git a/lib/cpumask.c b/lib/cpumask.c -index fb22fb266f93..c3c76b833384 100644 ---- a/lib/cpumask.c -+++ b/lib/cpumask.c -@@ -261,3 +261,21 @@ int cpumask_any_and_distribute(const struct cpumask *src1p, - return next; - } - EXPORT_SYMBOL(cpumask_any_and_distribute); -+ -+int cpumask_any_distribute(const struct cpumask *srcp) -+{ -+ int next, prev; -+ -+ /* NOTE: our first selection will skip 0. 
*/ -+ prev = __this_cpu_read(distribute_cpu_mask_prev); -+ -+ next = cpumask_next(prev, srcp); -+ if (next >= nr_cpu_ids) -+ next = cpumask_first(srcp); -+ -+ if (next < nr_cpu_ids) -+ __this_cpu_write(distribute_cpu_mask_prev, next); -+ -+ return next; -+} -+EXPORT_SYMBOL(cpumask_any_distribute); --- -2.30.2 - diff --git a/debian/patches-rt/0014-sched-rt-Use-the-full-cpumask-for-balancing.patch b/debian/patches-rt/0014-sched-rt-Use-the-full-cpumask-for-balancing.patch deleted file mode 100644 index 0f256c65f..000000000 --- a/debian/patches-rt/0014-sched-rt-Use-the-full-cpumask-for-balancing.patch +++ /dev/null @@ -1,105 +0,0 @@ -From 1576f24789dac95172fede096e07843221fef58c Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:11 +0200 -Subject: [PATCH 014/296] sched,rt: Use the full cpumask for balancing -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -We want migrate_disable() tasks to get PULLs in order for them to PUSH -away the higher priority task. 
- -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/cpudeadline.c | 4 ++-- - kernel/sched/cpupri.c | 4 ++-- - kernel/sched/deadline.c | 4 ++-- - kernel/sched/rt.c | 4 ++-- - 4 files changed, 8 insertions(+), 8 deletions(-) - -diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c -index 8cb06c8c7eb1..ceb03d76c0cc 100644 ---- a/kernel/sched/cpudeadline.c -+++ b/kernel/sched/cpudeadline.c -@@ -120,7 +120,7 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p, - const struct sched_dl_entity *dl_se = &p->dl; - - if (later_mask && -- cpumask_and(later_mask, cp->free_cpus, p->cpus_ptr)) { -+ cpumask_and(later_mask, cp->free_cpus, &p->cpus_mask)) { - unsigned long cap, max_cap = 0; - int cpu, max_cpu = -1; - -@@ -151,7 +151,7 @@ int cpudl_find(struct cpudl *cp, struct task_struct *p, - - WARN_ON(best_cpu != -1 && !cpu_present(best_cpu)); - -- if (cpumask_test_cpu(best_cpu, p->cpus_ptr) && -+ if (cpumask_test_cpu(best_cpu, &p->cpus_mask) && - dl_time_before(dl_se->deadline, cp->elements[0].dl)) { - if (later_mask) - cpumask_set_cpu(best_cpu, later_mask); -diff --git a/kernel/sched/cpupri.c b/kernel/sched/cpupri.c -index 0033731a0797..11c4df2010de 100644 ---- a/kernel/sched/cpupri.c -+++ b/kernel/sched/cpupri.c -@@ -73,11 +73,11 @@ static inline int __cpupri_find(struct cpupri *cp, struct task_struct *p, - if (skip) - return 0; - -- if (cpumask_any_and(p->cpus_ptr, vec->mask) >= nr_cpu_ids) -+ if (cpumask_any_and(&p->cpus_mask, vec->mask) >= nr_cpu_ids) - return 0; - - if (lowest_mask) { -- cpumask_and(lowest_mask, p->cpus_ptr, vec->mask); -+ cpumask_and(lowest_mask, &p->cpus_mask, vec->mask); - - /* - * We have to ensure that we have at least one bit -diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c -index 3c6b3d13e1b9..f38650b776ab 100644 ---- a/kernel/sched/deadline.c -+++ b/kernel/sched/deadline.c -@@ -1918,7 +1918,7 @@ static void 
task_fork_dl(struct task_struct *p) - static int pick_dl_task(struct rq *rq, struct task_struct *p, int cpu) - { - if (!task_running(rq, p) && -- cpumask_test_cpu(cpu, p->cpus_ptr)) -+ cpumask_test_cpu(cpu, &p->cpus_mask)) - return 1; - return 0; - } -@@ -2068,7 +2068,7 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq) - /* Retry if something changed. */ - if (double_lock_balance(rq, later_rq)) { - if (unlikely(task_rq(task) != rq || -- !cpumask_test_cpu(later_rq->cpu, task->cpus_ptr) || -+ !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) || - task_running(rq, task) || - !dl_task(task) || - !task_on_rq_queued(task))) { -diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c -index 2525a1beed26..cf63346a07e4 100644 ---- a/kernel/sched/rt.c -+++ b/kernel/sched/rt.c -@@ -1658,7 +1658,7 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p) - static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu) - { - if (!task_running(rq, p) && -- cpumask_test_cpu(cpu, p->cpus_ptr)) -+ cpumask_test_cpu(cpu, &p->cpus_mask)) - return 1; - - return 0; -@@ -1811,7 +1811,7 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq) - * Also make sure that it wasn't scheduled on its rq. 
- */ - if (unlikely(task_rq(task) != rq || -- !cpumask_test_cpu(lowest_rq->cpu, task->cpus_ptr) || -+ !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) || - task_running(rq, task) || - !rt_task(task) || - !task_on_rq_queued(task))) { --- -2.30.2 - diff --git a/debian/patches-rt/0015-sched-lockdep-Annotate-pi_lock-recursion.patch b/debian/patches-rt/0015-sched-lockdep-Annotate-pi_lock-recursion.patch deleted file mode 100644 index c3f45136f..000000000 --- a/debian/patches-rt/0015-sched-lockdep-Annotate-pi_lock-recursion.patch +++ /dev/null @@ -1,52 +0,0 @@ -From 3dbc2728c03b2fa44549480e280a02acdc006bc7 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:12 +0200 -Subject: [PATCH 015/296] sched, lockdep: Annotate ->pi_lock recursion -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -There's a valid ->pi_lock recursion issue where the actual PI code -tries to wake up the stop task. Make lockdep aware so it doesn't -complain about this. 
- -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 15 +++++++++++++++ - 1 file changed, 15 insertions(+) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 91f2985ee447..90e87c34d5dd 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -2654,6 +2654,7 @@ int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags) - - void sched_set_stop_task(int cpu, struct task_struct *stop) - { -+ static struct lock_class_key stop_pi_lock; - struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 }; - struct task_struct *old_stop = cpu_rq(cpu)->stop; - -@@ -2669,6 +2670,20 @@ void sched_set_stop_task(int cpu, struct task_struct *stop) - sched_setscheduler_nocheck(stop, SCHED_FIFO, ¶m); - - stop->sched_class = &stop_sched_class; -+ -+ /* -+ * The PI code calls rt_mutex_setprio() with ->pi_lock held to -+ * adjust the effective priority of a task. As a result, -+ * rt_mutex_setprio() can trigger (RT) balancing operations, -+ * which can then trigger wakeups of the stop thread to push -+ * around the current task. -+ * -+ * The stop task itself will never be part of the PI-chain, it -+ * never blocks, therefore that ->pi_lock recursion is safe. -+ * Tell lockdep about this by placing the stop->pi_lock in its -+ * own class. 
-+ */ -+ lockdep_set_class(&stop->pi_lock, &stop_pi_lock); - } - - cpu_rq(cpu)->stop = stop; --- -2.30.2 - diff --git a/debian/patches-rt/0016-sched-Fix-migrate_disable-vs-rt-dl-balancing.patch b/debian/patches-rt/0016-sched-Fix-migrate_disable-vs-rt-dl-balancing.patch deleted file mode 100644 index 8de3e5cb7..000000000 --- a/debian/patches-rt/0016-sched-Fix-migrate_disable-vs-rt-dl-balancing.patch +++ /dev/null @@ -1,495 +0,0 @@ -From 077b7f7f8aa0aaab11e16de1f9650814086b1475 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:13 +0200 -Subject: [PATCH 016/296] sched: Fix migrate_disable() vs rt/dl balancing -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -In order to minimize the interference of migrate_disable() on lower -priority tasks, which can be deprived of runtime due to being stuck -below a higher priority task. Teach the RT/DL balancers to push away -these higher priority tasks when a lower priority task gets selected -to run on a freshly demoted CPU (pull). - -This adds migration interference to the higher priority task, but -restores bandwidth to system that would otherwise be irrevocably lost. -Without this it would be possible to have all tasks on the system -stuck on a single CPU, each task preempted in a migrate_disable() -section with a single high priority task running. - -This way we can still approximate running the M highest priority tasks -on the system. - -Migrating the top task away is (ofcourse) still subject to -migrate_disable() too, which means the lower task is subject to an -interference equivalent to the worst case migrate_disable() section. 
- -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/preempt.h | 40 +++++++++++++----------- - include/linux/sched.h | 3 +- - kernel/sched/core.c | 67 +++++++++++++++++++++++++++++++++++------ - kernel/sched/deadline.c | 29 +++++++++++++----- - kernel/sched/rt.c | 63 ++++++++++++++++++++++++++++++-------- - kernel/sched/sched.h | 32 ++++++++++++++++++++ - 6 files changed, 186 insertions(+), 48 deletions(-) - -diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index 97ba7c920653..8b43922e65df 100644 ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -325,24 +325,28 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier, - #if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) - - /* -- * Migrate-Disable and why it is (strongly) undesired. -- * -- * The premise of the Real-Time schedulers we have on Linux -- * (SCHED_FIFO/SCHED_DEADLINE) is that M CPUs can/will run M tasks -- * concurrently, provided there are sufficient runnable tasks, also known as -- * work-conserving. For instance SCHED_DEADLINE tries to schedule the M -- * earliest deadline threads, and SCHED_FIFO the M highest priority threads. -- * -- * The correctness of various scheduling models depends on this, but is it -- * broken by migrate_disable() that doesn't imply preempt_disable(). Where -- * preempt_disable() implies an immediate priority ceiling, preemptible -- * migrate_disable() allows nesting. -- * -- * The worst case is that all tasks preempt one another in a migrate_disable() -- * region and stack on a single CPU. This then reduces the available bandwidth -- * to a single CPU. And since Real-Time schedulability theory considers the -- * Worst-Case only, all Real-Time analysis shall revert to single-CPU -- * (instantly solving the SMP analysis problem). -+ * Migrate-Disable and why it is undesired. 
-+ * -+ * When a preempted task becomes elegible to run under the ideal model (IOW it -+ * becomes one of the M highest priority tasks), it might still have to wait -+ * for the preemptee's migrate_disable() section to complete. Thereby suffering -+ * a reduction in bandwidth in the exact duration of the migrate_disable() -+ * section. -+ * -+ * Per this argument, the change from preempt_disable() to migrate_disable() -+ * gets us: -+ * -+ * - a higher priority tasks gains reduced wake-up latency; with preempt_disable() -+ * it would have had to wait for the lower priority task. -+ * -+ * - a lower priority tasks; which under preempt_disable() could've instantly -+ * migrated away when another CPU becomes available, is now constrained -+ * by the ability to push the higher priority task away, which might itself be -+ * in a migrate_disable() section, reducing it's available bandwidth. -+ * -+ * IOW it trades latency / moves the interference term, but it stays in the -+ * system, and as long as it remains unbounded, the system is not fully -+ * deterministic. - * - * - * The reason we have it anyway. 
-diff --git a/include/linux/sched.h b/include/linux/sched.h -index 71accd93e1af..b47c446ecf48 100644 ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -724,8 +724,9 @@ struct task_struct { - cpumask_t cpus_mask; - void *migration_pending; - #if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -- int migration_disabled; -+ unsigned short migration_disabled; - #endif -+ unsigned short migration_flags; - - #ifdef CONFIG_PREEMPT_RCU - int rcu_read_lock_nesting; -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 90e87c34d5dd..dd9c1fed70a6 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1761,11 +1761,6 @@ void migrate_enable(void) - } - EXPORT_SYMBOL_GPL(migrate_enable); - --static inline bool is_migration_disabled(struct task_struct *p) --{ -- return p->migration_disabled; --} -- - static inline bool rq_has_pinned_tasks(struct rq *rq) - { - return rq->nr_pinned; -@@ -1970,6 +1965,49 @@ static int migration_cpu_stop(void *data) - return 0; - } - -+int push_cpu_stop(void *arg) -+{ -+ struct rq *lowest_rq = NULL, *rq = this_rq(); -+ struct task_struct *p = arg; -+ -+ raw_spin_lock_irq(&p->pi_lock); -+ raw_spin_lock(&rq->lock); -+ -+ if (task_rq(p) != rq) -+ goto out_unlock; -+ -+ if (is_migration_disabled(p)) { -+ p->migration_flags |= MDF_PUSH; -+ goto out_unlock; -+ } -+ -+ p->migration_flags &= ~MDF_PUSH; -+ -+ if (p->sched_class->find_lock_rq) -+ lowest_rq = p->sched_class->find_lock_rq(p, rq); -+ -+ if (!lowest_rq) -+ goto out_unlock; -+ -+ // XXX validate p is still the highest prio task -+ if (task_rq(p) == rq) { -+ deactivate_task(rq, p, 0); -+ set_task_cpu(p, lowest_rq->cpu); -+ activate_task(lowest_rq, p, 0); -+ resched_curr(lowest_rq); -+ } -+ -+ double_unlock_balance(rq, lowest_rq); -+ -+out_unlock: -+ rq->push_busy = false; -+ raw_spin_unlock(&rq->lock); -+ raw_spin_unlock_irq(&p->pi_lock); -+ -+ put_task_struct(p); -+ return 0; -+} -+ - /* - * sched_class::set_cpus_allowed must do the below, but is not required to - 
* actually call this function. -@@ -2050,6 +2088,14 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag - - /* Can the task run on the task's current CPU? If so, we're done */ - if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask)) { -+ struct task_struct *push_task = NULL; -+ -+ if ((flags & SCA_MIGRATE_ENABLE) && -+ (p->migration_flags & MDF_PUSH) && !rq->push_busy) { -+ rq->push_busy = true; -+ push_task = get_task_struct(p); -+ } -+ - pending = p->migration_pending; - if (pending) { - refcount_inc(&pending->refs); -@@ -2058,6 +2104,11 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag - } - task_rq_unlock(rq, p, rf); - -+ if (push_task) { -+ stop_one_cpu_nowait(rq->cpu, push_cpu_stop, -+ p, &rq->push_work); -+ } -+ - if (complete) - goto do_complete; - -@@ -2094,6 +2145,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag - if (flags & SCA_MIGRATE_ENABLE) { - - refcount_inc(&pending->refs); /* pending->{arg,stop_work} */ -+ p->migration_flags &= ~MDF_PUSH; - task_rq_unlock(rq, p, rf); - - pending->arg = (struct migration_arg) { -@@ -2712,11 +2764,6 @@ static inline int __set_cpus_allowed_ptr(struct task_struct *p, - - static inline void migrate_disable_switch(struct rq *rq, struct task_struct *p) { } - --static inline bool is_migration_disabled(struct task_struct *p) --{ -- return false; --} -- - static inline bool rq_has_pinned_tasks(struct rq *rq) - { - return false; -diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c -index f38650b776ab..8d615aacaafc 100644 ---- a/kernel/sched/deadline.c -+++ b/kernel/sched/deadline.c -@@ -2135,6 +2135,9 @@ static int push_dl_task(struct rq *rq) - return 0; - - retry: -+ if (is_migration_disabled(next_task)) -+ return 0; -+ - if (WARN_ON(next_task == rq->curr)) - return 0; - -@@ -2212,7 +2215,7 @@ static void push_dl_tasks(struct rq *rq) - static void pull_dl_task(struct rq *this_rq) - { - int this_cpu = this_rq->cpu, 
cpu; -- struct task_struct *p; -+ struct task_struct *p, *push_task; - bool resched = false; - struct rq *src_rq; - u64 dmin = LONG_MAX; -@@ -2242,6 +2245,7 @@ static void pull_dl_task(struct rq *this_rq) - continue; - - /* Might drop this_rq->lock */ -+ push_task = NULL; - double_lock_balance(this_rq, src_rq); - - /* -@@ -2273,17 +2277,27 @@ static void pull_dl_task(struct rq *this_rq) - src_rq->curr->dl.deadline)) - goto skip; - -- resched = true; -- -- deactivate_task(src_rq, p, 0); -- set_task_cpu(p, this_cpu); -- activate_task(this_rq, p, 0); -- dmin = p->dl.deadline; -+ if (is_migration_disabled(p)) { -+ push_task = get_push_task(src_rq); -+ } else { -+ deactivate_task(src_rq, p, 0); -+ set_task_cpu(p, this_cpu); -+ activate_task(this_rq, p, 0); -+ dmin = p->dl.deadline; -+ resched = true; -+ } - - /* Is there any other task even earlier? */ - } - skip: - double_unlock_balance(this_rq, src_rq); -+ -+ if (push_task) { -+ raw_spin_unlock(&this_rq->lock); -+ stop_one_cpu_nowait(src_rq->cpu, push_cpu_stop, -+ push_task, &src_rq->push_work); -+ raw_spin_lock(&this_rq->lock); -+ } - } - - if (resched) -@@ -2530,6 +2544,7 @@ const struct sched_class dl_sched_class - .rq_online = rq_online_dl, - .rq_offline = rq_offline_dl, - .task_woken = task_woken_dl, -+ .find_lock_rq = find_lock_later_rq, - #endif - - .task_tick = task_tick_dl, -diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c -index cf63346a07e4..c592e47cafed 100644 ---- a/kernel/sched/rt.c -+++ b/kernel/sched/rt.c -@@ -1859,7 +1859,7 @@ static struct task_struct *pick_next_pushable_task(struct rq *rq) - * running task can migrate over to a CPU that is running a task - * of lesser priority. 
- */ --static int push_rt_task(struct rq *rq) -+static int push_rt_task(struct rq *rq, bool pull) - { - struct task_struct *next_task; - struct rq *lowest_rq; -@@ -1873,6 +1873,34 @@ static int push_rt_task(struct rq *rq) - return 0; - - retry: -+ if (is_migration_disabled(next_task)) { -+ struct task_struct *push_task = NULL; -+ int cpu; -+ -+ if (!pull || rq->push_busy) -+ return 0; -+ -+ cpu = find_lowest_rq(rq->curr); -+ if (cpu == -1 || cpu == rq->cpu) -+ return 0; -+ -+ /* -+ * Given we found a CPU with lower priority than @next_task, -+ * therefore it should be running. However we cannot migrate it -+ * to this other CPU, instead attempt to push the current -+ * running task on this CPU away. -+ */ -+ push_task = get_push_task(rq); -+ if (push_task) { -+ raw_spin_unlock(&rq->lock); -+ stop_one_cpu_nowait(rq->cpu, push_cpu_stop, -+ push_task, &rq->push_work); -+ raw_spin_lock(&rq->lock); -+ } -+ -+ return 0; -+ } -+ - if (WARN_ON(next_task == rq->curr)) - return 0; - -@@ -1927,12 +1955,10 @@ static int push_rt_task(struct rq *rq) - deactivate_task(rq, next_task, 0); - set_task_cpu(next_task, lowest_rq->cpu); - activate_task(lowest_rq, next_task, 0); -- ret = 1; -- - resched_curr(lowest_rq); -+ ret = 1; - - double_unlock_balance(rq, lowest_rq); -- - out: - put_task_struct(next_task); - -@@ -1942,7 +1968,7 @@ static int push_rt_task(struct rq *rq) - static void push_rt_tasks(struct rq *rq) - { - /* push_rt_task will return true if it moved an RT */ -- while (push_rt_task(rq)) -+ while (push_rt_task(rq, false)) - ; - } - -@@ -2095,7 +2121,8 @@ void rto_push_irq_work_func(struct irq_work *work) - */ - if (has_pushable_tasks(rq)) { - raw_spin_lock(&rq->lock); -- push_rt_tasks(rq); -+ while (push_rt_task(rq, true)) -+ ; - raw_spin_unlock(&rq->lock); - } - -@@ -2120,7 +2147,7 @@ static void pull_rt_task(struct rq *this_rq) - { - int this_cpu = this_rq->cpu, cpu; - bool resched = false; -- struct task_struct *p; -+ struct task_struct *p, *push_task; - struct rq 
*src_rq; - int rt_overload_count = rt_overloaded(this_rq); - -@@ -2167,6 +2194,7 @@ static void pull_rt_task(struct rq *this_rq) - * double_lock_balance, and another CPU could - * alter this_rq - */ -+ push_task = NULL; - double_lock_balance(this_rq, src_rq); - - /* -@@ -2194,11 +2222,14 @@ static void pull_rt_task(struct rq *this_rq) - if (p->prio < src_rq->curr->prio) - goto skip; - -- resched = true; -- -- deactivate_task(src_rq, p, 0); -- set_task_cpu(p, this_cpu); -- activate_task(this_rq, p, 0); -+ if (is_migration_disabled(p)) { -+ push_task = get_push_task(src_rq); -+ } else { -+ deactivate_task(src_rq, p, 0); -+ set_task_cpu(p, this_cpu); -+ activate_task(this_rq, p, 0); -+ resched = true; -+ } - /* - * We continue with the search, just in - * case there's an even higher prio task -@@ -2208,6 +2239,13 @@ static void pull_rt_task(struct rq *this_rq) - } - skip: - double_unlock_balance(this_rq, src_rq); -+ -+ if (push_task) { -+ raw_spin_unlock(&this_rq->lock); -+ stop_one_cpu_nowait(src_rq->cpu, push_cpu_stop, -+ push_task, &src_rq->push_work); -+ raw_spin_lock(&this_rq->lock); -+ } - } - - if (resched) -@@ -2449,6 +2487,7 @@ const struct sched_class rt_sched_class - .rq_offline = rq_offline_rt, - .task_woken = task_woken_rt, - .switched_from = switched_from_rt, -+ .find_lock_rq = find_lock_lowest_rq, - #endif - - .task_tick = task_tick_rt, -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index 3be3a1e9179f..fb8af162ccaf 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h -@@ -1052,6 +1052,8 @@ struct rq { - #if defined(CONFIG_PREEMPT_RT) && defined(CONFIG_SMP) - unsigned int nr_pinned; - #endif -+ unsigned int push_busy; -+ struct cpu_stop_work push_work; - }; - - #ifdef CONFIG_FAIR_GROUP_SCHED -@@ -1079,6 +1081,16 @@ static inline int cpu_of(struct rq *rq) - #endif - } - -+#define MDF_PUSH 0x01 -+ -+static inline bool is_migration_disabled(struct task_struct *p) -+{ -+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+ return 
p->migration_disabled; -+#else -+ return false; -+#endif -+} - - #ifdef CONFIG_SCHED_SMT - extern void __update_idle_core(struct rq *rq); -@@ -1818,6 +1830,8 @@ struct sched_class { - - void (*rq_online)(struct rq *rq); - void (*rq_offline)(struct rq *rq); -+ -+ struct rq *(*find_lock_rq)(struct task_struct *p, struct rq *rq); - #endif - - void (*task_tick)(struct rq *rq, struct task_struct *p, int queued); -@@ -1913,6 +1927,24 @@ extern void trigger_load_balance(struct rq *rq); - - extern void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_mask, u32 flags); - -+static inline struct task_struct *get_push_task(struct rq *rq) -+{ -+ struct task_struct *p = rq->curr; -+ -+ lockdep_assert_held(&rq->lock); -+ -+ if (rq->push_busy) -+ return NULL; -+ -+ if (p->nr_cpus_allowed == 1) -+ return NULL; -+ -+ rq->push_busy = true; -+ return get_task_struct(p); -+} -+ -+extern int push_cpu_stop(void *arg); -+ - #endif - - #ifdef CONFIG_CPU_IDLE --- -2.30.2 - diff --git a/debian/patches-rt/0017-sched-proc-Print-accurate-cpumask-vs-migrate_disable.patch b/debian/patches-rt/0017-sched-proc-Print-accurate-cpumask-vs-migrate_disable.patch deleted file mode 100644 index b1de6ec94..000000000 --- a/debian/patches-rt/0017-sched-proc-Print-accurate-cpumask-vs-migrate_disable.patch +++ /dev/null @@ -1,35 +0,0 @@ -From c35e90c5eaa626ccc0308dadfaa069ed9455ecb7 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Fri, 23 Oct 2020 12:12:14 +0200 -Subject: [PATCH 017/296] sched/proc: Print accurate cpumask vs - migrate_disable() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Ensure /proc/*/status doesn't print 'random' cpumasks due to -migrate_disable(). 
-
-Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- fs/proc/array.c | 4 ++--
- 1 file changed, 2 insertions(+), 2 deletions(-)
-
-diff --git a/fs/proc/array.c b/fs/proc/array.c
-index 65ec2029fa80..7052441be967 100644
---- a/fs/proc/array.c
-+++ b/fs/proc/array.c
-@@ -382,9 +382,9 @@ static inline void task_context_switch_counts(struct seq_file *m,
- static void task_cpus_allowed(struct seq_file *m, struct task_struct *task)
- {
- 	seq_printf(m, "Cpus_allowed:\t%*pb\n",
--		   cpumask_pr_args(task->cpus_ptr));
-+		   cpumask_pr_args(&task->cpus_mask));
- 	seq_printf(m, "Cpus_allowed_list:\t%*pbl\n",
--		   cpumask_pr_args(task->cpus_ptr));
-+		   cpumask_pr_args(&task->cpus_mask));
- }
-
- static inline void task_core_dumping(struct seq_file *m, struct mm_struct *mm)
---
-2.30.2
-
diff --git a/debian/patches-rt/0018-sched-Add-migrate_disable-tracepoints.patch b/debian/patches-rt/0018-sched-Add-migrate_disable-tracepoints.patch
deleted file mode 100644
index 43ded39ad..000000000
--- a/debian/patches-rt/0018-sched-Add-migrate_disable-tracepoints.patch
+++ /dev/null
@@ -1,110 +0,0 @@
-From 070fe6fcb7291170e36ec8b8b9ca9709b1222f00 Mon Sep 17 00:00:00 2001
-From: Peter Zijlstra <peterz@infradead.org>
-Date: Fri, 23 Oct 2020 12:12:15 +0200
-Subject: [PATCH 018/296] sched: Add migrate_disable() tracepoints
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-XXX write a tracer:
-
- - 'migirate_disable() -> migrate_enable()' time in task_sched_runtime()
- - 'migrate_pull -> sched-in' time in task_sched_runtime()
-
-The first will give worst case for the second, which is the actual
-interference experienced by the task to due migration constraints of
-migrate_disable().
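The description of patch 0018 above sketches a tracer that pairs the `migrate_disable()`/`migrate_enable()` tracepoints to measure how long a task stays pinned to its CPU. A minimal userspace sketch of the pairing arithmetic such a tracer would do; the function name and the timestamps are hypothetical, and this is not a real tracepoint API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given matched timestamp pairs (in ns) from hypothetical
 * migrate_disable/migrate_enable tracepoints, return the worst-case
 * pinned interval. Per the patch description, that worst case bounds
 * the 'migrate_pull -> sched-in' interference seen by the task. */
static uint64_t max_pinned_ns(const uint64_t *disable_ts,
			      const uint64_t *enable_ts, size_t n)
{
	uint64_t worst = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t span = enable_ts[i] - disable_ts[i];

		if (span > worst)
			worst = span;
	}
	return worst;
}

/* Tiny self-check with made-up timestamps: spans are 50, 380, 10 ns. */
static uint64_t demo_worst_case(void)
{
	const uint64_t dis[] = { 100, 500, 900 };
	const uint64_t ena[] = { 150, 880, 910 };

	return max_pinned_ns(dis, ena, 3);
}
```

A real implementation would attach to the `sched_migrate_disable_tp`/`sched_migrate_enable_tp` tracepoints the patch declares and stream timestamps out per task; only the max-of-differences reduction is shown here.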
- -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/trace/events/sched.h | 12 ++++++++++++ - kernel/sched/core.c | 4 ++++ - kernel/sched/deadline.c | 1 + - kernel/sched/rt.c | 8 +++++++- - 4 files changed, 24 insertions(+), 1 deletion(-) - -diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h -index c96a4337afe6..e48f584abf5f 100644 ---- a/include/trace/events/sched.h -+++ b/include/trace/events/sched.h -@@ -650,6 +650,18 @@ DECLARE_TRACE(sched_update_nr_running_tp, - TP_PROTO(struct rq *rq, int change), - TP_ARGS(rq, change)); - -+DECLARE_TRACE(sched_migrate_disable_tp, -+ TP_PROTO(struct task_struct *p), -+ TP_ARGS(p)); -+ -+DECLARE_TRACE(sched_migrate_enable_tp, -+ TP_PROTO(struct task_struct *p), -+ TP_ARGS(p)); -+ -+DECLARE_TRACE(sched_migrate_pull_tp, -+ TP_PROTO(struct task_struct *p), -+ TP_ARGS(p)); -+ - #endif /* _TRACE_SCHED_H */ - - /* This part must be outside protection */ -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index dd9c1fed70a6..5cad6f589ffd 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1726,6 +1726,8 @@ void migrate_disable(void) - return; - } - -+ trace_sched_migrate_disable_tp(p); -+ - preempt_disable(); - this_rq()->nr_pinned++; - p->migration_disabled = 1; -@@ -1758,6 +1760,8 @@ void migrate_enable(void) - p->migration_disabled = 0; - this_rq()->nr_pinned--; - preempt_enable(); -+ -+ trace_sched_migrate_enable_tp(p); - } - EXPORT_SYMBOL_GPL(migrate_enable); - -diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c -index 8d615aacaafc..56faef8c9238 100644 ---- a/kernel/sched/deadline.c -+++ b/kernel/sched/deadline.c -@@ -2278,6 +2278,7 @@ static void pull_dl_task(struct rq *this_rq) - goto skip; - - if (is_migration_disabled(p)) { -+ trace_sched_migrate_pull_tp(p); - push_task = get_push_task(src_rq); - } else { - deactivate_task(src_rq, p, 0); -diff --git a/kernel/sched/rt.c 
b/kernel/sched/rt.c -index c592e47cafed..1ed7e3dfee9e 100644 ---- a/kernel/sched/rt.c -+++ b/kernel/sched/rt.c -@@ -1877,7 +1877,12 @@ static int push_rt_task(struct rq *rq, bool pull) - struct task_struct *push_task = NULL; - int cpu; - -- if (!pull || rq->push_busy) -+ if (!pull) -+ return 0; -+ -+ trace_sched_migrate_pull_tp(next_task); -+ -+ if (rq->push_busy) - return 0; - - cpu = find_lowest_rq(rq->curr); -@@ -2223,6 +2228,7 @@ static void pull_rt_task(struct rq *this_rq) - goto skip; - - if (is_migration_disabled(p)) { -+ trace_sched_migrate_pull_tp(p); - push_task = get_push_task(src_rq); - } else { - deactivate_task(src_rq, p, 0); --- -2.30.2 - diff --git a/debian/patches-rt/0019-sched-Deny-self-issued-__set_cpus_allowed_ptr-when-m.patch b/debian/patches-rt/0019-sched-Deny-self-issued-__set_cpus_allowed_ptr-when-m.patch deleted file mode 100644 index 775bf28f5..000000000 --- a/debian/patches-rt/0019-sched-Deny-self-issued-__set_cpus_allowed_ptr-when-m.patch +++ /dev/null @@ -1,46 +0,0 @@ -From 8be4fed27ebe9a9bf02c344f099d3efd325e2688 Mon Sep 17 00:00:00 2001 -From: Valentin Schneider <valentin.schneider@arm.com> -Date: Fri, 23 Oct 2020 12:12:16 +0200 -Subject: [PATCH 019/296] sched: Deny self-issued __set_cpus_allowed_ptr() when - migrate_disable() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - - migrate_disable(); - set_cpus_allowed_ptr(current, {something excluding task_cpu(current)}); - affine_move_task(); <-- never returns - -Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Link: https://lkml.kernel.org/r/20201013140116.26651-1-valentin.schneider@arm.com -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 13 +++++++++++-- - 1 file changed, 11 insertions(+), 2 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 5cad6f589ffd..860c083ddc4b 100644 ---- 
a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -2238,8 +2238,17 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, - goto out; - } - -- if (!(flags & SCA_MIGRATE_ENABLE) && cpumask_equal(&p->cpus_mask, new_mask)) -- goto out; -+ if (!(flags & SCA_MIGRATE_ENABLE)) { -+ if (cpumask_equal(&p->cpus_mask, new_mask)) -+ goto out; -+ -+ if (WARN_ON_ONCE(p == current && -+ is_migration_disabled(p) && -+ !cpumask_test_cpu(task_cpu(p), new_mask))) { -+ ret = -EBUSY; -+ goto out; -+ } -+ } - - /* - * Picking a ~random cpu helps in cases where we are changing affinity --- -2.30.2 - diff --git a/debian/patches-rt/0020-sched-Comment-affine_move_task.patch b/debian/patches-rt/0020-sched-Comment-affine_move_task.patch deleted file mode 100644 index c2b69ac10..000000000 --- a/debian/patches-rt/0020-sched-Comment-affine_move_task.patch +++ /dev/null @@ -1,130 +0,0 @@ -From d215f795cb0996af5ce481483516543e323c0efd Mon Sep 17 00:00:00 2001 -From: Valentin Schneider <valentin.schneider@arm.com> -Date: Fri, 23 Oct 2020 12:12:17 +0200 -Subject: [PATCH 020/296] sched: Comment affine_move_task() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Link: https://lkml.kernel.org/r/20201013140116.26651-2-valentin.schneider@arm.com -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 81 +++++++++++++++++++++++++++++++++++++++++++-- - 1 file changed, 79 insertions(+), 2 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 860c083ddc4b..3a1db2775fa4 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -2078,7 +2078,75 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) - } - - /* -- * This function is wildly self concurrent, consider at least 3 times. 
-+ * This function is wildly self concurrent; here be dragons. -+ * -+ * -+ * When given a valid mask, __set_cpus_allowed_ptr() must block until the -+ * designated task is enqueued on an allowed CPU. If that task is currently -+ * running, we have to kick it out using the CPU stopper. -+ * -+ * Migrate-Disable comes along and tramples all over our nice sandcastle. -+ * Consider: -+ * -+ * Initial conditions: P0->cpus_mask = [0, 1] -+ * -+ * P0@CPU0 P1 -+ * -+ * migrate_disable(); -+ * <preempted> -+ * set_cpus_allowed_ptr(P0, [1]); -+ * -+ * P1 *cannot* return from this set_cpus_allowed_ptr() call until P0 executes -+ * its outermost migrate_enable() (i.e. it exits its Migrate-Disable region). -+ * This means we need the following scheme: -+ * -+ * P0@CPU0 P1 -+ * -+ * migrate_disable(); -+ * <preempted> -+ * set_cpus_allowed_ptr(P0, [1]); -+ * <blocks> -+ * <resumes> -+ * migrate_enable(); -+ * __set_cpus_allowed_ptr(); -+ * <wakes local stopper> -+ * `--> <woken on migration completion> -+ * -+ * Now the fun stuff: there may be several P1-like tasks, i.e. multiple -+ * concurrent set_cpus_allowed_ptr(P0, [*]) calls. CPU affinity changes of any -+ * task p are serialized by p->pi_lock, which we can leverage: the one that -+ * should come into effect at the end of the Migrate-Disable region is the last -+ * one. This means we only need to track a single cpumask (i.e. p->cpus_mask), -+ * but we still need to properly signal those waiting tasks at the appropriate -+ * moment. -+ * -+ * This is implemented using struct set_affinity_pending. The first -+ * __set_cpus_allowed_ptr() caller within a given Migrate-Disable region will -+ * setup an instance of that struct and install it on the targeted task_struct. -+ * Any and all further callers will reuse that instance. Those then wait for -+ * a completion signaled at the tail of the CPU stopper callback (1), triggered -+ * on the end of the Migrate-Disable region (i.e. outermost migrate_enable()). 
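The serialization rules the comment above lays out (one shared `set_affinity_pending` per Migrate-Disable region, the last caller's mask wins, and every waiter holds a reference until completion) can be modeled outside the kernel. A toy single-threaded sketch, assuming nothing beyond the comment's own description; the names are invented and this is not the real `affine_move_task()` code:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct set_affinity_pending: a refcount plus the
 * single tracked mask (p->cpus_mask in the real code). */
struct toy_pending {
	int refs;
	unsigned int mask;
};

/* Each later __set_cpus_allowed_ptr() caller reuses the installed
 * instance: it takes a reference, and its mask overwrites the previous
 * one, so the last request before migrate_enable() is the one that
 * takes effect. */
static void toy_request(struct toy_pending *p, unsigned int mask)
{
	p->refs++;
	p->mask = mask;
}

/* Completion side: each waiter drops its reference; only when the
 * count reaches zero may the owner's stack-allocated instance go away. */
static bool toy_put(struct toy_pending *p)
{
	return --p->refs == 0;
}

static unsigned int demo_last_mask_wins(void)
{
	struct toy_pending p = { 0, 0 };

	toy_request(&p, 0x1);	/* first caller: allow CPU0 only */
	toy_request(&p, 0x3);	/* second caller: allow CPU0-1; this wins */
	assert(!toy_put(&p));	/* first waiter: refs 2 -> 1 */
	assert(toy_put(&p));	/* last waiter: refs 1 -> 0 */
	return p.mask;
}
```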
-+ * -+ * -+ * (1) In the cases covered above. There is one more where the completion is -+ * signaled within affine_move_task() itself: when a subsequent affinity request -+ * cancels the need for an active migration. Consider: -+ * -+ * Initial conditions: P0->cpus_mask = [0, 1] -+ * -+ * P0@CPU0 P1 P2 -+ * -+ * migrate_disable(); -+ * <preempted> -+ * set_cpus_allowed_ptr(P0, [1]); -+ * <blocks> -+ * set_cpus_allowed_ptr(P0, [0, 1]); -+ * <signal completion> -+ * <awakes> -+ * -+ * Note that the above is safe vs a concurrent migrate_enable(), as any -+ * pending affinity completion is preceded an uninstallion of -+ * p->migration_pending done with p->pi_lock held. - */ - static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flags *rf, - int dest_cpu, unsigned int flags) -@@ -2122,6 +2190,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag - if (!(flags & SCA_MIGRATE_ENABLE)) { - /* serialized by p->pi_lock */ - if (!p->migration_pending) { -+ /* Install the request */ - refcount_set(&my_pending.refs, 1); - init_completion(&my_pending.done); - p->migration_pending = &my_pending; -@@ -2165,7 +2234,11 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag - } - - if (task_running(rq, p) || p->state == TASK_WAKING) { -- -+ /* -+ * Lessen races (and headaches) by delegating -+ * is_migration_disabled(p) checks to the stopper, which will -+ * run on the same CPU as said p. 
-+ */ - task_rq_unlock(rq, p, rf); - stop_one_cpu(cpu_of(rq), migration_cpu_stop, &arg); - -@@ -2190,6 +2263,10 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag - if (refcount_dec_and_test(&pending->refs)) - wake_up_var(&pending->refs); - -+ /* -+ * Block the original owner of &pending until all subsequent callers -+ * have seen the completion and decremented the refcount -+ */ - wait_var_event(&my_pending.refs, !refcount_read(&my_pending.refs)); - - return 0; --- -2.30.2 - diff --git a/debian/patches-rt/0021-sched-Unlock-the-rq-in-affine_move_task-error-path.patch b/debian/patches-rt/0021-sched-Unlock-the-rq-in-affine_move_task-error-path.patch deleted file mode 100644 index 826912b75..000000000 --- a/debian/patches-rt/0021-sched-Unlock-the-rq-in-affine_move_task-error-path.patch +++ /dev/null @@ -1,34 +0,0 @@ -From f78ee74dadf7dd5a7457462540654c0dad15fc21 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 9 Nov 2020 15:54:03 +0100 -Subject: [PATCH 021/296] sched: Unlock the rq in affine_move_task() error path -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Unlock the rq if returned early in the error path. - -Reported-by: Joe Korty <joe.korty@concurrent-rt.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Link: https://lkml.kernel.org/r/20201106203921.GA48461@zipoli.concurrent-rt.com ---- - kernel/sched/core.c | 4 +++- - 1 file changed, 3 insertions(+), 1 deletion(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 3a1db2775fa4..fd8dfadba894 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -2212,8 +2212,10 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag - * - * Either way, we really should have a @pending here. 
- */ -- if (WARN_ON_ONCE(!pending)) -+ if (WARN_ON_ONCE(!pending)) { -+ task_rq_unlock(rq, p, rf); - return -EINVAL; -+ } - - if (flags & SCA_MIGRATE_ENABLE) { - --- -2.30.2 - diff --git a/debian/patches-rt/0022-sched-Fix-migration_cpu_stop-WARN.patch b/debian/patches-rt/0022-sched-Fix-migration_cpu_stop-WARN.patch deleted file mode 100644 index b56baab8f..000000000 --- a/debian/patches-rt/0022-sched-Fix-migration_cpu_stop-WARN.patch +++ /dev/null @@ -1,47 +0,0 @@ -From 58a65398b211eaf686a414269ad13439ca8d4064 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Tue, 17 Nov 2020 12:14:51 +0100 -Subject: [PATCH 022/296] sched: Fix migration_cpu_stop() WARN -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Oleksandr reported hitting the WARN in the 'task_rq(p) != rq' branch -of migration_cpu_stop(). Valentin noted that using cpu_of(rq) in that -case is just plain wrong to begin with, since per the earlier branch -that isn't the actual CPU of the task. - -Replace both instances of is_cpu_allowed() by a direct p->cpus_mask -test using task_cpu(). - -Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> -Debugged-by: Valentin Schneider <valentin.schneider@arm.com> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index fd8dfadba894..ac6704de36d3 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1913,7 +1913,7 @@ static int migration_cpu_stop(void *data) - * and we should be valid again. Nothing to do. - */ - if (!pending) { -- WARN_ON_ONCE(!is_cpu_allowed(p, cpu_of(rq))); -+ WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask)); - goto out; - } - -@@ -1941,7 +1941,7 @@ static int migration_cpu_stop(void *data) - * valid again. Nothing to do. 
*/
- 	if (!pending) {
--		WARN_ON_ONCE(!is_cpu_allowed(p, cpu_of(rq)));
-+		WARN_ON_ONCE(!cpumask_test_cpu(task_cpu(p), &p->cpus_mask));
- 		goto out;
- 	}
-
---
-2.30.2
-
diff --git a/debian/patches-rt/0023-sched-core-Add-missing-completion-for-affine_move_ta.patch b/debian/patches-rt/0023-sched-core-Add-missing-completion-for-affine_move_ta.patch
deleted file mode 100644
index e1458ff9d..000000000
--- a/debian/patches-rt/0023-sched-core-Add-missing-completion-for-affine_move_ta.patch
+++ /dev/null
@@ -1,79 +0,0 @@
-From 5cde0b8f5457c33119a117fc1ca5faf7efc7ed59 Mon Sep 17 00:00:00 2001
-From: Valentin Schneider <valentin.schneider@arm.com>
-Date: Fri, 13 Nov 2020 11:24:14 +0000
-Subject: [PATCH 023/296] sched/core: Add missing completion for
- affine_move_task() waiters
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Qian reported that some fuzzer issuing sched_setaffinity() ends up stuck on
-a wait_for_completion(). The problematic pattern seems to be:
-
-	affine_move_task()
-		// task_running() case
-		stop_one_cpu();
-		wait_for_completion(&pending->done);
-
-Combined with, on the stopper side:
-
-	migration_cpu_stop()
-		// Task moved between unlocks and scheduling the stopper
-		task_rq(p) != rq &&
-		// task_running() case
-		dest_cpu >= 0
-
-	=> no complete_all()
-
-This can happen with both PREEMPT and !PREEMPT, although !PREEMPT should
-be more likely to see this given the targeted task has a much bigger window
-to block and be woken up elsewhere before the stopper runs.
-
-Make migration_cpu_stop() always look at pending affinity requests; signal
-their completion if the stopper hits a rq mismatch but the task is
-still within its allowed mask. When Migrate-Disable isn't involved, this
-matches the previous set_cpus_allowed_ptr() vs migration_cpu_stop()
-behaviour.
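The fix patch 0023 describes above boils down to one extra predicate in `migration_cpu_stop()`: if an affinity request is pending and the task already sits on an allowed CPU, signal completion even though the runqueue did not match. A stripped-down model of just that check, using a plain bitmask for the CPU set; the function name is invented and this is not the kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the added early-complete path: with ->pi_lock held the
 * allowed mask is stable, so "request pending + current CPU already
 * allowed" means the affinity request is satisfied as-is and the
 * waiters can be completed without migrating anything. */
static bool stopper_should_complete(bool pending,
				    unsigned int allowed_mask, int task_cpu)
{
	return pending && (allowed_mask & (1u << task_cpu));
}
```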
- -Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") -Reported-by: Qian Cai <cai@redhat.com> -Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> -Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Link: https://lore.kernel.org/lkml/8b62fd1ad1b18def27f18e2ee2df3ff5b36d0762.camel@redhat.com -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/core.c | 13 ++++++++++++- - 1 file changed, 12 insertions(+), 1 deletion(-) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index ac6704de36d3..9429059d96e3 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1925,7 +1925,7 @@ static int migration_cpu_stop(void *data) - else - p->wake_cpu = dest_cpu; - -- } else if (dest_cpu < 0) { -+ } else if (dest_cpu < 0 || pending) { - /* - * This happens when we get migrated between migrate_enable()'s - * preempt_enable() and scheduling the stopper task. At that -@@ -1935,6 +1935,17 @@ static int migration_cpu_stop(void *data) - * more likely. - */ - -+ /* -+ * The task moved before the stopper got to run. We're holding -+ * ->pi_lock, so the allowed mask is stable - if it got -+ * somewhere allowed, we're done. 
-+ */ -+ if (pending && cpumask_test_cpu(task_cpu(p), p->cpus_ptr)) { -+ p->migration_pending = NULL; -+ complete = true; -+ goto out; -+ } -+ - /* - * When this was migrate_enable() but we no longer have an - * @pending, a concurrent SCA 'fixed' things and we should be --- -2.30.2 - diff --git a/debian/patches-rt/0024-mm-highmem-Un-EXPORT-__kmap_atomic_idx.patch b/debian/patches-rt/0024-mm-highmem-Un-EXPORT-__kmap_atomic_idx.patch deleted file mode 100644 index c88997357..000000000 --- a/debian/patches-rt/0024-mm-highmem-Un-EXPORT-__kmap_atomic_idx.patch +++ /dev/null @@ -1,33 +0,0 @@ -From ad0f6268fd5f00992c3ac5ac65ea33c7c01d07d2 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:13 +0100 -Subject: [PATCH 024/296] mm/highmem: Un-EXPORT __kmap_atomic_idx() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Nothing in modules can use that. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Christoph Hellwig <hch@lst.de> -Cc: Andrew Morton <akpm@linux-foundation.org> -Cc: linux-mm@kvack.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/highmem.c | 2 -- - 1 file changed, 2 deletions(-) - -diff --git a/mm/highmem.c b/mm/highmem.c -index 1352a27951e3..6abfd762eee7 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -108,8 +108,6 @@ static inline wait_queue_head_t *get_pkmap_wait_queue_head(unsigned int color) - atomic_long_t _totalhigh_pages __read_mostly; - EXPORT_SYMBOL(_totalhigh_pages); - --EXPORT_PER_CPU_SYMBOL(__kmap_atomic_idx); -- - unsigned int nr_free_highpages (void) - { - struct zone *zone; --- -2.30.2 - diff --git a/debian/patches-rt/0025-highmem-Remove-unused-functions.patch b/debian/patches-rt/0025-highmem-Remove-unused-functions.patch deleted file mode 100644 index 9db7eb79c..000000000 --- a/debian/patches-rt/0025-highmem-Remove-unused-functions.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 
45e892c57ab606aba7cad7fad69e23960c2e7507 Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Tue, 3 Nov 2020 10:27:14 +0100
-Subject: [PATCH 025/296] highmem: Remove unused functions
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Nothing uses totalhigh_pages_dec() and totalhigh_pages_set().
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- include/linux/highmem.h | 10 ----------
- 1 file changed, 10 deletions(-)
-
-diff --git a/include/linux/highmem.h b/include/linux/highmem.h
-index 14e6202ce47f..f5c31338f0a3 100644
---- a/include/linux/highmem.h
-+++ b/include/linux/highmem.h
-@@ -104,21 +104,11 @@ static inline void totalhigh_pages_inc(void)
- 	atomic_long_inc(&_totalhigh_pages);
- }
-
--static inline void totalhigh_pages_dec(void)
--{
--	atomic_long_dec(&_totalhigh_pages);
--}
--
- static inline void totalhigh_pages_add(long count)
- {
- 	atomic_long_add(count, &_totalhigh_pages);
- }
-
--static inline void totalhigh_pages_set(long val)
--{
--	atomic_long_set(&_totalhigh_pages, val);
--}
--
- void kmap_flush_unused(void);
-
- struct page *kmap_to_page(void *addr);
---
-2.30.2
-
diff --git a/debian/patches-rt/0026-fs-Remove-asm-kmap_types.h-includes.patch b/debian/patches-rt/0026-fs-Remove-asm-kmap_types.h-includes.patch
deleted file mode 100644
index befcc9d4f..000000000
--- a/debian/patches-rt/0026-fs-Remove-asm-kmap_types.h-includes.patch
+++ /dev/null
@@ -1,50 +0,0 @@
-From 7a0301497eca9332ed06a129d8808a28a278db00 Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Tue, 3 Nov 2020 10:27:15 +0100
-Subject: [PATCH 026/296] fs: Remove asm/kmap_types.h includes
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Historical leftovers from the time where kmap() had fixed slots.
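The `totalhigh_pages` helpers that patch 0025 above keeps (`_inc` and `_add`) are a standard atomic-counter pattern. A userspace sketch using C11 atomics; the names deliberately mirror the kernel helpers, but this is illustrative code, not the kernel's `atomic_long_t` API:

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace mirror of the surviving highmem counter helpers,
 * built on C11 atomic_long instead of the kernel's atomic_long_t. */
static atomic_long _totalhigh_pages;

static void totalhigh_pages_inc(void)
{
	atomic_fetch_add(&_totalhigh_pages, 1);
}

static void totalhigh_pages_add(long count)
{
	atomic_fetch_add(&_totalhigh_pages, count);
}

static long totalhigh_pages(void)
{
	return atomic_load(&_totalhigh_pages);
}

/* Self-check: starts at 0, +1 then +4 gives 5. */
static long demo_counter(void)
{
	totalhigh_pages_inc();
	totalhigh_pages_add(4);
	return totalhigh_pages();
}
```

The removed `_dec`/`_set` variants would be one-liners on the same pattern, which is exactly why deleting the unused ones is cheap.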
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Alexander Viro <viro@zeniv.linux.org.uk> -Cc: Benjamin LaHaise <bcrl@kvack.org> -Cc: linux-fsdevel@vger.kernel.org -Cc: linux-aio@kvack.org -Cc: Chris Mason <clm@fb.com> -Cc: Josef Bacik <josef@toxicpanda.com> -Cc: David Sterba <dsterba@suse.com> -Cc: linux-btrfs@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - fs/aio.c | 1 - - fs/btrfs/ctree.h | 1 - - 2 files changed, 2 deletions(-) - -diff --git a/fs/aio.c b/fs/aio.c -index 6a21d8919409..76ce0cc3ee4e 100644 ---- a/fs/aio.c -+++ b/fs/aio.c -@@ -43,7 +43,6 @@ - #include <linux/mount.h> - #include <linux/pseudo_fs.h> - --#include <asm/kmap_types.h> - #include <linux/uaccess.h> - #include <linux/nospec.h> - -diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h -index bcc6848bb6d6..fabbf6cc45bf 100644 ---- a/fs/btrfs/ctree.h -+++ b/fs/btrfs/ctree.h -@@ -17,7 +17,6 @@ - #include <linux/wait.h> - #include <linux/slab.h> - #include <trace/events/btrfs.h> --#include <asm/kmap_types.h> - #include <asm/unaligned.h> - #include <linux/pagemap.h> - #include <linux/btrfs.h> --- -2.30.2 - diff --git a/debian/patches-rt/0027-sh-highmem-Remove-all-traces-of-unused-cruft.patch b/debian/patches-rt/0027-sh-highmem-Remove-all-traces-of-unused-cruft.patch deleted file mode 100644 index 0c6823641..000000000 --- a/debian/patches-rt/0027-sh-highmem-Remove-all-traces-of-unused-cruft.patch +++ /dev/null @@ -1,94 +0,0 @@ -From 1595c4d815b0ca8c87f74e43fbce45668e28898f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:16 +0100 -Subject: [PATCH 027/296] sh/highmem: Remove all traces of unused cruft -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -For whatever reasons SH has highmem bits all over the place but does -not enable it via Kconfig. Remove the bitrot. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/sh/include/asm/fixmap.h | 8 -------- - arch/sh/include/asm/kmap_types.h | 15 --------------- - arch/sh/mm/init.c | 8 -------- - 3 files changed, 31 deletions(-) - delete mode 100644 arch/sh/include/asm/kmap_types.h - -diff --git a/arch/sh/include/asm/fixmap.h b/arch/sh/include/asm/fixmap.h -index f38adc189b83..b07fbc7f7bc6 100644 ---- a/arch/sh/include/asm/fixmap.h -+++ b/arch/sh/include/asm/fixmap.h -@@ -13,9 +13,6 @@ - #include <linux/kernel.h> - #include <linux/threads.h> - #include <asm/page.h> --#ifdef CONFIG_HIGHMEM --#include <asm/kmap_types.h> --#endif - - /* - * Here we define all the compile-time 'special' virtual -@@ -53,11 +50,6 @@ enum fixed_addresses { - FIX_CMAP_BEGIN, - FIX_CMAP_END = FIX_CMAP_BEGIN + (FIX_N_COLOURS * NR_CPUS) - 1, - --#ifdef CONFIG_HIGHMEM -- FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ -- FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1, --#endif -- - #ifdef CONFIG_IOREMAP_FIXED - /* - * FIX_IOREMAP entries are useful for mapping physical address -diff --git a/arch/sh/include/asm/kmap_types.h b/arch/sh/include/asm/kmap_types.h -deleted file mode 100644 -index b78107f923dd..000000000000 ---- a/arch/sh/include/asm/kmap_types.h -+++ /dev/null -@@ -1,15 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef __SH_KMAP_TYPES_H --#define __SH_KMAP_TYPES_H -- --/* Dummy header just to define km_type. 
*/ -- --#ifdef CONFIG_DEBUG_HIGHMEM --#define __WITH_KM_FENCE --#endif -- --#include <asm-generic/kmap_types.h> -- --#undef __WITH_KM_FENCE -- --#endif -diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c -index 3348e0c4d769..0db6919af8d3 100644 ---- a/arch/sh/mm/init.c -+++ b/arch/sh/mm/init.c -@@ -362,9 +362,6 @@ void __init mem_init(void) - mem_init_print_info(NULL); - pr_info("virtual kernel memory layout:\n" - " fixmap : 0x%08lx - 0x%08lx (%4ld kB)\n" --#ifdef CONFIG_HIGHMEM -- " pkmap : 0x%08lx - 0x%08lx (%4ld kB)\n" --#endif - " vmalloc : 0x%08lx - 0x%08lx (%4ld MB)\n" - " lowmem : 0x%08lx - 0x%08lx (%4ld MB) (cached)\n" - #ifdef CONFIG_UNCACHED_MAPPING -@@ -376,11 +373,6 @@ void __init mem_init(void) - FIXADDR_START, FIXADDR_TOP, - (FIXADDR_TOP - FIXADDR_START) >> 10, - --#ifdef CONFIG_HIGHMEM -- PKMAP_BASE, PKMAP_BASE+LAST_PKMAP*PAGE_SIZE, -- (LAST_PKMAP*PAGE_SIZE) >> 10, --#endif -- - (unsigned long)VMALLOC_START, VMALLOC_END, - (VMALLOC_END - VMALLOC_START) >> 20, - --- -2.30.2 - diff --git a/debian/patches-rt/0028-asm-generic-Provide-kmap_size.h.patch b/debian/patches-rt/0028-asm-generic-Provide-kmap_size.h.patch deleted file mode 100644 index 9b342e550..000000000 --- a/debian/patches-rt/0028-asm-generic-Provide-kmap_size.h.patch +++ /dev/null @@ -1,70 +0,0 @@ -From 84e921b31ca33d0b6e962e12113a4888aafce89b Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:17 +0100 -Subject: [PATCH 028/296] asm-generic: Provide kmap_size.h -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -kmap_types.h is a misnomer because the old atomic MAP based array does not -exist anymore and the whole indirection of architectures including -kmap_types.h is inconinstent and does not allow to provide guard page -debugging for this misfeature. - -Add a common header file which defines the mapping stack size for all -architectures. 
Will be used when converting architectures over to a -generic kmap_local/atomic implementation. - -The array size is chosen with the following constraints in mind: - - - The deepest nest level in one context is 3 according to code - inspection. - - - The worst case nesting for the upcoming reemptible version would be: - - 2 maps in task context and a fault inside - 2 maps in the fault handler - 3 maps in softirq - 2 maps in interrupt - -So a total of 16 is sufficient and probably overestimated. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/asm-generic/Kbuild | 1 + - include/asm-generic/kmap_size.h | 12 ++++++++++++ - 2 files changed, 13 insertions(+) - create mode 100644 include/asm-generic/kmap_size.h - -diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild -index d1300c6e0a47..3114a6da7e56 100644 ---- a/include/asm-generic/Kbuild -+++ b/include/asm-generic/Kbuild -@@ -31,6 +31,7 @@ mandatory-y += irq_regs.h - mandatory-y += irq_work.h - mandatory-y += kdebug.h - mandatory-y += kmap_types.h -+mandatory-y += kmap_size.h - mandatory-y += kprobes.h - mandatory-y += linkage.h - mandatory-y += local.h -diff --git a/include/asm-generic/kmap_size.h b/include/asm-generic/kmap_size.h -new file mode 100644 -index 000000000000..9d6c7786a645 ---- /dev/null -+++ b/include/asm-generic/kmap_size.h -@@ -0,0 +1,12 @@ -+/* SPDX-License-Identifier: GPL-2.0 */ -+#ifndef _ASM_GENERIC_KMAP_SIZE_H -+#define _ASM_GENERIC_KMAP_SIZE_H -+ -+/* For debug this provides guard pages between the maps */ -+#ifdef CONFIG_DEBUG_HIGHMEM -+# define KM_MAX_IDX 33 -+#else -+# define KM_MAX_IDX 16 -+#endif -+ -+#endif --- -2.30.2 - diff --git a/debian/patches-rt/0029-highmem-Provide-generic-variant-of-kmap_atomic.patch b/debian/patches-rt/0029-highmem-Provide-generic-variant-of-kmap_atomic.patch deleted file mode 100644 index bbee4941e..000000000 --- 
a/debian/patches-rt/0029-highmem-Provide-generic-variant-of-kmap_atomic.patch +++ /dev/null @@ -1,346 +0,0 @@ -From 77f4edaa9a4d42e7b7db689b140270ef921bf25a Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:18 +0100 -Subject: [PATCH 029/296] highmem: Provide generic variant of kmap_atomic* -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The kmap_atomic* interfaces in all architectures are pretty much the same -except for post map operations (flush) and pre- and post unmap operations. - -Provide a generic variant for that. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Andrew Morton <akpm@linux-foundation.org> -Cc: linux-mm@kvack.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem.h | 82 ++++++++++++++++++----- - mm/Kconfig | 3 + - mm/highmem.c | 144 +++++++++++++++++++++++++++++++++++++++- - 3 files changed, 211 insertions(+), 18 deletions(-) - -diff --git a/include/linux/highmem.h b/include/linux/highmem.h -index f5c31338f0a3..f5ecee9c2576 100644 ---- a/include/linux/highmem.h -+++ b/include/linux/highmem.h -@@ -31,9 +31,16 @@ static inline void invalidate_kernel_vmap_range(void *vaddr, int size) - - #include <asm/kmap_types.h> - -+/* -+ * Outside of CONFIG_HIGHMEM to support X86 32bit iomap_atomic() cruft. -+ */ -+#ifdef CONFIG_KMAP_LOCAL -+void *__kmap_local_pfn_prot(unsigned long pfn, pgprot_t prot); -+void *__kmap_local_page_prot(struct page *page, pgprot_t prot); -+void kunmap_local_indexed(void *vaddr); -+#endif -+ - #ifdef CONFIG_HIGHMEM --extern void *kmap_atomic_high_prot(struct page *page, pgprot_t prot); --extern void kunmap_atomic_high(void *kvaddr); - #include <asm/highmem.h> - - #ifndef ARCH_HAS_KMAP_FLUSH_TLB -@@ -81,6 +88,11 @@ static inline void kunmap(struct page *page) - * be used in IRQ contexts, so in some (very limited) cases we need - * it. 
- */ -+ -+#ifndef CONFIG_KMAP_LOCAL -+void *kmap_atomic_high_prot(struct page *page, pgprot_t prot); -+void kunmap_atomic_high(void *kvaddr); -+ - static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) - { - preempt_disable(); -@@ -89,7 +101,38 @@ static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) - return page_address(page); - return kmap_atomic_high_prot(page, prot); - } --#define kmap_atomic(page) kmap_atomic_prot(page, kmap_prot) -+ -+static inline void __kunmap_atomic(void *vaddr) -+{ -+ kunmap_atomic_high(vaddr); -+} -+#else /* !CONFIG_KMAP_LOCAL */ -+ -+static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) -+{ -+ preempt_disable(); -+ pagefault_disable(); -+ return __kmap_local_page_prot(page, prot); -+} -+ -+static inline void *kmap_atomic_pfn(unsigned long pfn) -+{ -+ preempt_disable(); -+ pagefault_disable(); -+ return __kmap_local_pfn_prot(pfn, kmap_prot); -+} -+ -+static inline void __kunmap_atomic(void *addr) -+{ -+ kunmap_local_indexed(addr); -+} -+ -+#endif /* CONFIG_KMAP_LOCAL */ -+ -+static inline void *kmap_atomic(struct page *page) -+{ -+ return kmap_atomic_prot(page, kmap_prot); -+} - - /* declarations for linux/mm/highmem.c */ - unsigned int nr_free_highpages(void); -@@ -147,25 +190,33 @@ static inline void *kmap_atomic(struct page *page) - pagefault_disable(); - return page_address(page); - } --#define kmap_atomic_prot(page, prot) kmap_atomic(page) - --static inline void kunmap_atomic_high(void *addr) -+static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) -+{ -+ return kmap_atomic(page); -+} -+ -+static inline void *kmap_atomic_pfn(unsigned long pfn) -+{ -+ return kmap_atomic(pfn_to_page(pfn)); -+} -+ -+static inline void __kunmap_atomic(void *addr) - { - /* - * Mostly nothing to do in the CONFIG_HIGHMEM=n case as kunmap_atomic() -- * handles re-enabling faults + preemption -+ * handles re-enabling faults and preemption - */ - #ifdef ARCH_HAS_FLUSH_ON_KUNMAP - 
kunmap_flush_on_unmap(addr); - #endif - } - --#define kmap_atomic_pfn(pfn) kmap_atomic(pfn_to_page(pfn)) -- - #define kmap_flush_unused() do {} while(0) - - #endif /* CONFIG_HIGHMEM */ - -+#if !defined(CONFIG_KMAP_LOCAL) - #if defined(CONFIG_HIGHMEM) || defined(CONFIG_X86_32) - - DECLARE_PER_CPU(int, __kmap_atomic_idx); -@@ -196,22 +247,21 @@ static inline void kmap_atomic_idx_pop(void) - __this_cpu_dec(__kmap_atomic_idx); - #endif - } -- -+#endif - #endif - - /* - * Prevent people trying to call kunmap_atomic() as if it were kunmap() - * kunmap_atomic() should get the return value of kmap_atomic, not the page. - */ --#define kunmap_atomic(addr) \ --do { \ -- BUILD_BUG_ON(__same_type((addr), struct page *)); \ -- kunmap_atomic_high(addr); \ -- pagefault_enable(); \ -- preempt_enable(); \ -+#define kunmap_atomic(__addr) \ -+do { \ -+ BUILD_BUG_ON(__same_type((__addr), struct page *)); \ -+ __kunmap_atomic(__addr); \ -+ pagefault_enable(); \ -+ preempt_enable(); \ - } while (0) - -- - /* when CONFIG_HIGHMEM is not set these will be plain clear/copy_page */ - #ifndef clear_user_highpage - static inline void clear_user_highpage(struct page *page, unsigned long vaddr) -diff --git a/mm/Kconfig b/mm/Kconfig -index 390165ffbb0f..8c49d09da214 100644 ---- a/mm/Kconfig -+++ b/mm/Kconfig -@@ -859,4 +859,7 @@ config ARCH_HAS_HUGEPD - config MAPPING_DIRTY_HELPERS - bool - -+config KMAP_LOCAL -+ bool -+ - endmenu -diff --git a/mm/highmem.c b/mm/highmem.c -index 6abfd762eee7..bb4ce13ee7e7 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -31,9 +31,11 @@ - #include <asm/tlbflush.h> - #include <linux/vmalloc.h> - -+#ifndef CONFIG_KMAP_LOCAL - #if defined(CONFIG_HIGHMEM) || defined(CONFIG_X86_32) - DEFINE_PER_CPU(int, __kmap_atomic_idx); - #endif -+#endif - - /* - * Virtual_count is not a pure "count". 
-@@ -365,9 +367,147 @@ void kunmap_high(struct page *page) - if (need_wakeup) - wake_up(pkmap_map_wait); - } -- - EXPORT_SYMBOL(kunmap_high); --#endif /* CONFIG_HIGHMEM */ -+#endif /* CONFIG_HIGHMEM */ -+ -+#ifdef CONFIG_KMAP_LOCAL -+ -+#include <asm/kmap_size.h> -+ -+static DEFINE_PER_CPU(int, __kmap_local_idx); -+ -+static inline int kmap_local_idx_push(void) -+{ -+ int idx = __this_cpu_inc_return(__kmap_local_idx) - 1; -+ -+ WARN_ON_ONCE(in_irq() && !irqs_disabled()); -+ BUG_ON(idx >= KM_MAX_IDX); -+ return idx; -+} -+ -+static inline int kmap_local_idx(void) -+{ -+ return __this_cpu_read(__kmap_local_idx) - 1; -+} -+ -+static inline void kmap_local_idx_pop(void) -+{ -+ int idx = __this_cpu_dec_return(__kmap_local_idx); -+ -+ BUG_ON(idx < 0); -+} -+ -+#ifndef arch_kmap_local_post_map -+# define arch_kmap_local_post_map(vaddr, pteval) do { } while (0) -+#endif -+#ifndef arch_kmap_local_pre_unmap -+# define arch_kmap_local_pre_unmap(vaddr) do { } while (0) -+#endif -+ -+#ifndef arch_kmap_local_post_unmap -+# define arch_kmap_local_post_unmap(vaddr) do { } while (0) -+#endif -+ -+#ifndef arch_kmap_local_map_idx -+#define arch_kmap_local_map_idx(idx, pfn) kmap_local_calc_idx(idx) -+#endif -+ -+#ifndef arch_kmap_local_unmap_idx -+#define arch_kmap_local_unmap_idx(idx, vaddr) kmap_local_calc_idx(idx) -+#endif -+ -+#ifndef arch_kmap_local_high_get -+static inline void *arch_kmap_local_high_get(struct page *page) -+{ -+ return NULL; -+} -+#endif -+ -+/* Unmap a local mapping which was obtained by kmap_high_get() */ -+static inline void kmap_high_unmap_local(unsigned long vaddr) -+{ -+#ifdef ARCH_NEEDS_KMAP_HIGH_GET -+ if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) -+ kunmap_high(pte_page(pkmap_page_table[PKMAP_NR(vaddr)])); -+#endif -+} -+ -+static inline int kmap_local_calc_idx(int idx) -+{ -+ return idx + KM_MAX_IDX * smp_processor_id(); -+} -+ -+static pte_t *__kmap_pte; -+ -+static pte_t *kmap_get_pte(void) -+{ -+ if (!__kmap_pte) -+ __kmap_pte = 
virt_to_kpte(__fix_to_virt(FIX_KMAP_BEGIN)); -+ return __kmap_pte; -+} -+ -+void *__kmap_local_pfn_prot(unsigned long pfn, pgprot_t prot) -+{ -+ pte_t pteval, *kmap_pte = kmap_get_pte(); -+ unsigned long vaddr; -+ int idx; -+ -+ preempt_disable(); -+ idx = arch_kmap_local_map_idx(kmap_local_idx_push(), pfn); -+ vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -+ BUG_ON(!pte_none(*(kmap_pte - idx))); -+ pteval = pfn_pte(pfn, prot); -+ set_pte_at(&init_mm, vaddr, kmap_pte - idx, pteval); -+ arch_kmap_local_post_map(vaddr, pteval); -+ preempt_enable(); -+ -+ return (void *)vaddr; -+} -+EXPORT_SYMBOL_GPL(__kmap_local_pfn_prot); -+ -+void *__kmap_local_page_prot(struct page *page, pgprot_t prot) -+{ -+ void *kmap; -+ -+ if (!PageHighMem(page)) -+ return page_address(page); -+ -+ /* Try kmap_high_get() if architecture has it enabled */ -+ kmap = arch_kmap_local_high_get(page); -+ if (kmap) -+ return kmap; -+ -+ return __kmap_local_pfn_prot(page_to_pfn(page), prot); -+} -+EXPORT_SYMBOL(__kmap_local_page_prot); -+ -+void kunmap_local_indexed(void *vaddr) -+{ -+ unsigned long addr = (unsigned long) vaddr & PAGE_MASK; -+ pte_t *kmap_pte = kmap_get_pte(); -+ int idx; -+ -+ if (addr < __fix_to_virt(FIX_KMAP_END) || -+ addr > __fix_to_virt(FIX_KMAP_BEGIN)) { -+ WARN_ON_ONCE(addr < PAGE_OFFSET); -+ -+ /* Handle mappings which were obtained by kmap_high_get() */ -+ kmap_high_unmap_local(addr); -+ return; -+ } -+ -+ preempt_disable(); -+ idx = arch_kmap_local_unmap_idx(kmap_local_idx(), addr); -+ WARN_ON_ONCE(addr != __fix_to_virt(FIX_KMAP_BEGIN + idx)); -+ -+ arch_kmap_local_pre_unmap(addr); -+ pte_clear(&init_mm, addr, kmap_pte - idx); -+ arch_kmap_local_post_unmap(addr); -+ kmap_local_idx_pop(); -+ preempt_enable(); -+} -+EXPORT_SYMBOL(kunmap_local_indexed); -+#endif - - #if defined(HASHED_PAGE_VIRTUAL) - --- -2.30.2 - diff --git a/debian/patches-rt/0030-highmem-Make-DEBUG_HIGHMEM-functional.patch b/debian/patches-rt/0030-highmem-Make-DEBUG_HIGHMEM-functional.patch deleted file 
mode 100644 index 2a813e820..000000000 --- a/debian/patches-rt/0030-highmem-Make-DEBUG_HIGHMEM-functional.patch +++ /dev/null @@ -1,61 +0,0 @@ -From 6b508aa161ea39320ea8ce790a419f34c00b3478 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:19 +0100 -Subject: [PATCH 030/296] highmem: Make DEBUG_HIGHMEM functional -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -For some obscure reason when CONFIG_DEBUG_HIGHMEM is enabled the stack -depth is increased from 20 to 41. But the only thing DEBUG_HIGHMEM does is -to enable a few BUG_ON()'s in the mapping code. - -That's a leftover from the historical mapping code which had fixed entries -for various purposes. DEBUG_HIGHMEM inserted guard mappings between the map -types. But that got all ditched when kmap_atomic() switched to a stack -based map management. Though the WITH_KM_FENCE magic survived without being -functional. All the thing does today is to increase the stack depth. - -Add a working implementation to the generic kmap_local* implementation. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/highmem.c | 14 ++++++++++++-- - 1 file changed, 12 insertions(+), 2 deletions(-) - -diff --git a/mm/highmem.c b/mm/highmem.c -index bb4ce13ee7e7..67d2d5983cb0 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -376,9 +376,19 @@ EXPORT_SYMBOL(kunmap_high); - - static DEFINE_PER_CPU(int, __kmap_local_idx); - -+/* -+ * With DEBUG_HIGHMEM the stack depth is doubled and every second -+ * slot is unused which acts as a guard page -+ */ -+#ifdef CONFIG_DEBUG_HIGHMEM -+# define KM_INCR 2 -+#else -+# define KM_INCR 1 -+#endif -+ - static inline int kmap_local_idx_push(void) - { -- int idx = __this_cpu_inc_return(__kmap_local_idx) - 1; -+ int idx = __this_cpu_add_return(__kmap_local_idx, KM_INCR) - 1; - - WARN_ON_ONCE(in_irq() && !irqs_disabled()); - BUG_ON(idx >= KM_MAX_IDX); -@@ -392,7 +402,7 @@ static inline int kmap_local_idx(void) - - static inline void kmap_local_idx_pop(void) - { -- int idx = __this_cpu_dec_return(__kmap_local_idx); -+ int idx = __this_cpu_sub_return(__kmap_local_idx, KM_INCR); - - BUG_ON(idx < 0); - } --- -2.30.2 - diff --git a/debian/patches-rt/0031-x86-mm-highmem-Use-generic-kmap-atomic-implementatio.patch b/debian/patches-rt/0031-x86-mm-highmem-Use-generic-kmap-atomic-implementatio.patch deleted file mode 100644 index 270e38ba3..000000000 --- a/debian/patches-rt/0031-x86-mm-highmem-Use-generic-kmap-atomic-implementatio.patch +++ /dev/null @@ -1,389 +0,0 @@ -From f692550d19768db25fb7cf96b68ab6ad055d38fe Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:20 +0100 -Subject: [PATCH 031/296] x86/mm/highmem: Use generic kmap atomic - implementation -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Convert X86 to the generic kmap atomic implementation and make the -iomap_atomic() naming convention consistent while at it. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: x86@kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/x86/Kconfig | 1 + - arch/x86/include/asm/fixmap.h | 5 +-- - arch/x86/include/asm/highmem.h | 13 ++++-- - arch/x86/include/asm/iomap.h | 18 ++++---- - arch/x86/include/asm/kmap_types.h | 13 ------ - arch/x86/include/asm/paravirt_types.h | 1 - - arch/x86/mm/highmem_32.c | 59 --------------------------- - arch/x86/mm/init_32.c | 15 ------- - arch/x86/mm/iomap_32.c | 59 +++------------------------ - include/linux/highmem.h | 2 +- - include/linux/io-mapping.h | 2 +- - mm/highmem.c | 2 +- - 12 files changed, 30 insertions(+), 160 deletions(-) - delete mode 100644 arch/x86/include/asm/kmap_types.h - -diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig -index 3a5ecb1039bf..7e1fd20234db 100644 ---- a/arch/x86/Kconfig -+++ b/arch/x86/Kconfig -@@ -15,6 +15,7 @@ config X86_32 - select CLKSRC_I8253 - select CLONE_BACKWARDS - select HAVE_DEBUG_STACKOVERFLOW -+ select KMAP_LOCAL - select MODULES_USE_ELF_REL - select OLD_SIGACTION - select GENERIC_VDSO_32 -diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h -index 77217bd292bd..8eba66a33e39 100644 ---- a/arch/x86/include/asm/fixmap.h -+++ b/arch/x86/include/asm/fixmap.h -@@ -31,7 +31,7 @@ - #include <asm/pgtable_types.h> - #ifdef CONFIG_X86_32 - #include <linux/threads.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #else - #include <uapi/asm/vsyscall.h> - #endif -@@ -94,7 +94,7 @@ enum fixed_addresses { - #endif - #ifdef CONFIG_X86_32 - FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ -- FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, -+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_MAX_IDX * NR_CPUS) - 1, - #ifdef CONFIG_PCI_MMCONFIG - FIX_PCIE_MCFG, - #endif -@@ -151,7 +151,6 @@ extern void reserve_top_address(unsigned long reserve); - - extern int fixmaps_set; - --extern pte_t *kmap_pte; - extern pte_t *pkmap_page_table; - - 
void __native_set_fixmap(enum fixed_addresses idx, pte_t pte); -diff --git a/arch/x86/include/asm/highmem.h b/arch/x86/include/asm/highmem.h -index 0f420b24e0fc..032e020853aa 100644 ---- a/arch/x86/include/asm/highmem.h -+++ b/arch/x86/include/asm/highmem.h -@@ -23,7 +23,6 @@ - - #include <linux/interrupt.h> - #include <linux/threads.h> --#include <asm/kmap_types.h> - #include <asm/tlbflush.h> - #include <asm/paravirt.h> - #include <asm/fixmap.h> -@@ -58,11 +57,17 @@ extern unsigned long highstart_pfn, highend_pfn; - #define PKMAP_NR(virt) ((virt-PKMAP_BASE) >> PAGE_SHIFT) - #define PKMAP_ADDR(nr) (PKMAP_BASE + ((nr) << PAGE_SHIFT)) - --void *kmap_atomic_pfn(unsigned long pfn); --void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot); -- - #define flush_cache_kmaps() do { } while (0) - -+#define arch_kmap_local_post_map(vaddr, pteval) \ -+ arch_flush_lazy_mmu_mode() -+ -+#define arch_kmap_local_post_unmap(vaddr) \ -+ do { \ -+ flush_tlb_one_kernel((vaddr)); \ -+ arch_flush_lazy_mmu_mode(); \ -+ } while (0) -+ - extern void add_highpages_with_active_regions(int nid, unsigned long start_pfn, - unsigned long end_pfn); - -diff --git a/arch/x86/include/asm/iomap.h b/arch/x86/include/asm/iomap.h -index bacf68c4d70e..0be7a30fd6bc 100644 ---- a/arch/x86/include/asm/iomap.h -+++ b/arch/x86/include/asm/iomap.h -@@ -9,19 +9,21 @@ - #include <linux/fs.h> - #include <linux/mm.h> - #include <linux/uaccess.h> -+#include <linux/highmem.h> - #include <asm/cacheflush.h> - #include <asm/tlbflush.h> - --void __iomem * --iomap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot); -+void __iomem *iomap_atomic_pfn_prot(unsigned long pfn, pgprot_t prot); - --void --iounmap_atomic(void __iomem *kvaddr); -+static inline void iounmap_atomic(void __iomem *vaddr) -+{ -+ kunmap_local_indexed((void __force *)vaddr); -+ pagefault_enable(); -+ preempt_enable(); -+} - --int --iomap_create_wc(resource_size_t base, unsigned long size, pgprot_t *prot); -+int iomap_create_wc(resource_size_t base, 
unsigned long size, pgprot_t *prot); - --void --iomap_free(resource_size_t base, unsigned long size); -+void iomap_free(resource_size_t base, unsigned long size); - - #endif /* _ASM_X86_IOMAP_H */ -diff --git a/arch/x86/include/asm/kmap_types.h b/arch/x86/include/asm/kmap_types.h -deleted file mode 100644 -index 04ab8266e347..000000000000 ---- a/arch/x86/include/asm/kmap_types.h -+++ /dev/null -@@ -1,13 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef _ASM_X86_KMAP_TYPES_H --#define _ASM_X86_KMAP_TYPES_H -- --#if defined(CONFIG_X86_32) && defined(CONFIG_DEBUG_HIGHMEM) --#define __WITH_KM_FENCE --#endif -- --#include <asm-generic/kmap_types.h> -- --#undef __WITH_KM_FENCE -- --#endif /* _ASM_X86_KMAP_TYPES_H */ -diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h -index 0fad9f61c76a..b6b02b7c19cc 100644 ---- a/arch/x86/include/asm/paravirt_types.h -+++ b/arch/x86/include/asm/paravirt_types.h -@@ -41,7 +41,6 @@ - #ifndef __ASSEMBLY__ - - #include <asm/desc_defs.h> --#include <asm/kmap_types.h> - #include <asm/pgtable_types.h> - #include <asm/nospec-branch.h> - -diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c -index 075fe51317b0..2c54b76d8f84 100644 ---- a/arch/x86/mm/highmem_32.c -+++ b/arch/x86/mm/highmem_32.c -@@ -4,65 +4,6 @@ - #include <linux/swap.h> /* for totalram_pages */ - #include <linux/memblock.h> - --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- unsigned long vaddr; -- int idx, type; -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -- BUG_ON(!pte_none(*(kmap_pte-idx))); -- set_pte(kmap_pte-idx, mk_pte(page, prot)); -- arch_flush_lazy_mmu_mode(); -- -- return (void *)vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --/* -- * This is the same as kmap_atomic() but can map memory that doesn't -- * have a struct page associated with it. 
-- */ --void *kmap_atomic_pfn(unsigned long pfn) --{ -- return kmap_atomic_prot_pfn(pfn, kmap_prot); --} --EXPORT_SYMBOL_GPL(kmap_atomic_pfn); -- --void kunmap_atomic_high(void *kvaddr) --{ -- unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- -- if (vaddr >= __fix_to_virt(FIX_KMAP_END) && -- vaddr <= __fix_to_virt(FIX_KMAP_BEGIN)) { -- int idx, type; -- -- type = kmap_atomic_idx(); -- idx = type + KM_TYPE_NR * smp_processor_id(); -- --#ifdef CONFIG_DEBUG_HIGHMEM -- WARN_ON_ONCE(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx)); --#endif -- /* -- * Force other mappings to Oops if they'll try to access this -- * pte without first remap it. Keeping stale mappings around -- * is a bad idea also, in case the page changes cacheability -- * attributes or becomes a protected page in a hypervisor. -- */ -- kpte_clear_flush(kmap_pte-idx, vaddr); -- kmap_atomic_idx_pop(); -- arch_flush_lazy_mmu_mode(); -- } --#ifdef CONFIG_DEBUG_HIGHMEM -- else { -- BUG_ON(vaddr < PAGE_OFFSET); -- BUG_ON(vaddr >= (unsigned long)high_memory); -- } --#endif --} --EXPORT_SYMBOL(kunmap_atomic_high); -- - void __init set_highmem_pages_init(void) - { - struct zone *zone; -diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c -index 7c055259de3a..da31c2635ee4 100644 ---- a/arch/x86/mm/init_32.c -+++ b/arch/x86/mm/init_32.c -@@ -394,19 +394,6 @@ kernel_physical_mapping_init(unsigned long start, - return last_map_addr; - } - --pte_t *kmap_pte; -- --static void __init kmap_init(void) --{ -- unsigned long kmap_vstart; -- -- /* -- * Cache the first kmap pte: -- */ -- kmap_vstart = __fix_to_virt(FIX_KMAP_BEGIN); -- kmap_pte = virt_to_kpte(kmap_vstart); --} -- - #ifdef CONFIG_HIGHMEM - static void __init permanent_kmaps_init(pgd_t *pgd_base) - { -@@ -712,8 +699,6 @@ void __init paging_init(void) - - __flush_tlb_all(); - -- kmap_init(); -- - /* - * NOTE: at this point the bootmem allocator is fully available. 
- */ -diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c -index f60398aeb644..e0a40d7cc66c 100644 ---- a/arch/x86/mm/iomap_32.c -+++ b/arch/x86/mm/iomap_32.c -@@ -44,28 +44,7 @@ void iomap_free(resource_size_t base, unsigned long size) - } - EXPORT_SYMBOL_GPL(iomap_free); - --void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot) --{ -- unsigned long vaddr; -- int idx, type; -- -- preempt_disable(); -- pagefault_disable(); -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR * smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -- set_pte(kmap_pte - idx, pfn_pte(pfn, prot)); -- arch_flush_lazy_mmu_mode(); -- -- return (void *)vaddr; --} -- --/* -- * Map 'pfn' using protections 'prot' -- */ --void __iomem * --iomap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot) -+void __iomem *iomap_atomic_pfn_prot(unsigned long pfn, pgprot_t prot) - { - /* - * For non-PAT systems, translate non-WB request to UC- just in -@@ -81,36 +60,8 @@ iomap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot) - /* Filter out unsupported __PAGE_KERNEL* bits: */ - pgprot_val(prot) &= __default_kernel_pte_mask; - -- return (void __force __iomem *) kmap_atomic_prot_pfn(pfn, prot); --} --EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn); -- --void --iounmap_atomic(void __iomem *kvaddr) --{ -- unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- -- if (vaddr >= __fix_to_virt(FIX_KMAP_END) && -- vaddr <= __fix_to_virt(FIX_KMAP_BEGIN)) { -- int idx, type; -- -- type = kmap_atomic_idx(); -- idx = type + KM_TYPE_NR * smp_processor_id(); -- --#ifdef CONFIG_DEBUG_HIGHMEM -- WARN_ON_ONCE(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx)); --#endif -- /* -- * Force other mappings to Oops if they'll try to access this -- * pte without first remap it. Keeping stale mappings around -- * is a bad idea also, in case the page changes cacheability -- * attributes or becomes a protected page in a hypervisor. 
-- */ -- kpte_clear_flush(kmap_pte-idx, vaddr); -- kmap_atomic_idx_pop(); -- } -- -- pagefault_enable(); -- preempt_enable(); -+ preempt_disable(); -+ pagefault_disable(); -+ return (void __force __iomem *)__kmap_local_pfn_prot(pfn, prot); - } --EXPORT_SYMBOL_GPL(iounmap_atomic); -+EXPORT_SYMBOL_GPL(iomap_atomic_pfn_prot); -diff --git a/include/linux/highmem.h b/include/linux/highmem.h -index f5ecee9c2576..1222a31be842 100644 ---- a/include/linux/highmem.h -+++ b/include/linux/highmem.h -@@ -217,7 +217,7 @@ static inline void __kunmap_atomic(void *addr) - #endif /* CONFIG_HIGHMEM */ - - #if !defined(CONFIG_KMAP_LOCAL) --#if defined(CONFIG_HIGHMEM) || defined(CONFIG_X86_32) -+#if defined(CONFIG_HIGHMEM) - - DECLARE_PER_CPU(int, __kmap_atomic_idx); - -diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h -index c75e4d3d8833..3b0940be72e9 100644 ---- a/include/linux/io-mapping.h -+++ b/include/linux/io-mapping.h -@@ -69,7 +69,7 @@ io_mapping_map_atomic_wc(struct io_mapping *mapping, - - BUG_ON(offset >= mapping->size); - phys_addr = mapping->base + offset; -- return iomap_atomic_prot_pfn(PHYS_PFN(phys_addr), mapping->prot); -+ return iomap_atomic_pfn_prot(PHYS_PFN(phys_addr), mapping->prot); - } - - static inline void -diff --git a/mm/highmem.c b/mm/highmem.c -index 67d2d5983cb0..77677c6844f7 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -32,7 +32,7 @@ - #include <linux/vmalloc.h> - - #ifndef CONFIG_KMAP_LOCAL --#if defined(CONFIG_HIGHMEM) || defined(CONFIG_X86_32) -+#ifdef CONFIG_HIGHMEM - DEFINE_PER_CPU(int, __kmap_atomic_idx); - #endif - #endif --- -2.30.2 - diff --git a/debian/patches-rt/0032-arc-mm-highmem-Use-generic-kmap-atomic-implementatio.patch b/debian/patches-rt/0032-arc-mm-highmem-Use-generic-kmap-atomic-implementatio.patch deleted file mode 100644 index 00fa124aa..000000000 --- a/debian/patches-rt/0032-arc-mm-highmem-Use-generic-kmap-atomic-implementatio.patch +++ /dev/null @@ -1,212 +0,0 @@ -From 
3d4a1e4225aa44f989d3d7f63dad274c9cc8a598 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:21 +0100 -Subject: [PATCH 032/296] arc/mm/highmem: Use generic kmap atomic - implementation -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Adopt the map ordering to match the other architectures and the generic -code. Also make the maximum entries limited and not dependend on the number -of CPUs. With the original implementation did the following calculation: - - nr_slots = mapsize >> PAGE_SHIFT; - -The results in either 512 or 1024 total slots depending on -configuration. The total slots have to be divided by the number of CPUs to -get the number of slots per CPU (former KM_TYPE_NR). ARC supports up to 4k -CPUs, so this just falls apart in random ways depending on the number of -CPUs and the actual kmap (atomic) nesting. The comment in highmem.c: - - * - fixmap anyhow needs a limited number of mappings. So 2M kvaddr == 256 PTE - * slots across NR_CPUS would be more than sufficient (generic code defines - * KM_TYPE_NR as 20). - -is just wrong. KM_TYPE_NR (now KM_MAX_IDX) is the number of slots per CPU -because kmap_local/atomic() needs to support nested mappings (thread, -softirq, interrupt). While KM_MAX_IDX might be overestimated, the above -reasoning is just wrong and clearly the highmem code was never tested with -any system with more than a few CPUs. - -Use the default number of slots and fail the build when it does not -fit. Randomly failing at runtime is not a really good option. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Vineet Gupta <vgupta@synopsys.com> -Cc: linux-snps-arc@lists.infradead.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arc/Kconfig | 1 + - arch/arc/include/asm/highmem.h | 26 +++++++++++---- - arch/arc/include/asm/kmap_types.h | 14 -------- - arch/arc/mm/highmem.c | 54 +++---------------------------- - 4 files changed, 26 insertions(+), 69 deletions(-) - delete mode 100644 arch/arc/include/asm/kmap_types.h - -diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig -index 0a89cc9def65..d8804001d550 100644 ---- a/arch/arc/Kconfig -+++ b/arch/arc/Kconfig -@@ -507,6 +507,7 @@ config LINUX_RAM_BASE - config HIGHMEM - bool "High Memory Support" - select ARCH_DISCONTIGMEM_ENABLE -+ select KMAP_LOCAL - help - With ARC 2G:2G address split, only upper 2G is directly addressable by - kernel. Enable this to potentially allow access to rest of 2G and PAE -diff --git a/arch/arc/include/asm/highmem.h b/arch/arc/include/asm/highmem.h -index 6e5eafb3afdd..a6b8e2c352c4 100644 ---- a/arch/arc/include/asm/highmem.h -+++ b/arch/arc/include/asm/highmem.h -@@ -9,17 +9,29 @@ - #ifdef CONFIG_HIGHMEM - - #include <uapi/asm/page.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> -+ -+#define FIXMAP_SIZE PGDIR_SIZE -+#define PKMAP_SIZE PGDIR_SIZE - - /* start after vmalloc area */ - #define FIXMAP_BASE (PAGE_OFFSET - FIXMAP_SIZE - PKMAP_SIZE) --#define FIXMAP_SIZE PGDIR_SIZE /* only 1 PGD worth */ --#define KM_TYPE_NR ((FIXMAP_SIZE >> PAGE_SHIFT)/NR_CPUS) --#define FIXMAP_ADDR(nr) (FIXMAP_BASE + ((nr) << PAGE_SHIFT)) -+ -+#define FIX_KMAP_SLOTS (KM_MAX_IDX * NR_CPUS) -+#define FIX_KMAP_BEGIN (0UL) -+#define FIX_KMAP_END ((FIX_KMAP_BEGIN + FIX_KMAP_SLOTS) - 1) -+ -+#define FIXADDR_TOP (FIXMAP_BASE + (FIX_KMAP_END << PAGE_SHIFT)) -+ -+/* -+ * This should be converted to the asm-generic version, but of course this -+ * is needlessly different from all other architectures. 
Sigh - tglx -+ */ -+#define __fix_to_virt(x) (FIXADDR_TOP - ((x) << PAGE_SHIFT)) -+#define __virt_to_fix(x) (((FIXADDR_TOP - ((x) & PAGE_MASK))) >> PAGE_SHIFT) - - /* start after fixmap area */ - #define PKMAP_BASE (FIXMAP_BASE + FIXMAP_SIZE) --#define PKMAP_SIZE PGDIR_SIZE - #define LAST_PKMAP (PKMAP_SIZE >> PAGE_SHIFT) - #define LAST_PKMAP_MASK (LAST_PKMAP - 1) - #define PKMAP_ADDR(nr) (PKMAP_BASE + ((nr) << PAGE_SHIFT)) -@@ -29,11 +41,13 @@ - - extern void kmap_init(void); - -+#define arch_kmap_local_post_unmap(vaddr) \ -+ local_flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE) -+ - static inline void flush_cache_kmaps(void) - { - flush_cache_all(); - } -- - #endif - - #endif -diff --git a/arch/arc/include/asm/kmap_types.h b/arch/arc/include/asm/kmap_types.h -deleted file mode 100644 -index fecf7851ec32..000000000000 ---- a/arch/arc/include/asm/kmap_types.h -+++ /dev/null -@@ -1,14 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0-only */ --/* -- * Copyright (C) 2015 Synopsys, Inc. (www.synopsys.com) -- */ -- --#ifndef _ASM_KMAP_TYPES_H --#define _ASM_KMAP_TYPES_H -- --/* -- * We primarily need to define KM_TYPE_NR here but that in turn -- * is a function of PGDIR_SIZE etc. -- * To avoid circular deps issue, put everything in asm/highmem.h -- */ --#endif -diff --git a/arch/arc/mm/highmem.c b/arch/arc/mm/highmem.c -index 1b9f473c6369..c79912a6b196 100644 ---- a/arch/arc/mm/highmem.c -+++ b/arch/arc/mm/highmem.c -@@ -36,9 +36,8 @@ - * This means each only has 1 PGDIR_SIZE worth of kvaddr mappings, which means - * 2M of kvaddr space for typical config (8K page and 11:8:13 traversal split) - * -- * - fixmap anyhow needs a limited number of mappings. So 2M kvaddr == 256 PTE -- * slots across NR_CPUS would be more than sufficient (generic code defines -- * KM_TYPE_NR as 20). -+ * - The fixed KMAP slots for kmap_local/atomic() require KM_MAX_IDX slots per -+ * CPU. So the number of CPUs sharing a single PTE page is limited. 
- * - * - pkmap being preemptible, in theory could do with more than 256 concurrent - * mappings. However, generic pkmap code: map_new_virtual(), doesn't traverse -@@ -47,48 +46,6 @@ - */ - - extern pte_t * pkmap_page_table; --static pte_t * fixmap_page_table; -- --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- int idx, cpu_idx; -- unsigned long vaddr; -- -- cpu_idx = kmap_atomic_idx_push(); -- idx = cpu_idx + KM_TYPE_NR * smp_processor_id(); -- vaddr = FIXMAP_ADDR(idx); -- -- set_pte_at(&init_mm, vaddr, fixmap_page_table + idx, -- mk_pte(page, prot)); -- -- return (void *)vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kv) --{ -- unsigned long kvaddr = (unsigned long)kv; -- -- if (kvaddr >= FIXMAP_BASE && kvaddr < (FIXMAP_BASE + FIXMAP_SIZE)) { -- -- /* -- * Because preemption is disabled, this vaddr can be associated -- * with the current allocated index. -- * But in case of multiple live kmap_atomic(), it still relies on -- * callers to unmap in right order. 
-- */ -- int cpu_idx = kmap_atomic_idx(); -- int idx = cpu_idx + KM_TYPE_NR * smp_processor_id(); -- -- WARN_ON(kvaddr != FIXMAP_ADDR(idx)); -- -- pte_clear(&init_mm, kvaddr, fixmap_page_table + idx); -- local_flush_tlb_kernel_range(kvaddr, kvaddr + PAGE_SIZE); -- -- kmap_atomic_idx_pop(); -- } --} --EXPORT_SYMBOL(kunmap_atomic_high); - - static noinline pte_t * __init alloc_kmap_pgtable(unsigned long kvaddr) - { -@@ -108,10 +65,9 @@ void __init kmap_init(void) - { - /* Due to recursive include hell, we can't do this in processor.h */ - BUILD_BUG_ON(PAGE_OFFSET < (VMALLOC_END + FIXMAP_SIZE + PKMAP_SIZE)); -+ BUILD_BUG_ON(LAST_PKMAP > PTRS_PER_PTE); -+ BUILD_BUG_ON(FIX_KMAP_SLOTS > PTRS_PER_PTE); - -- BUILD_BUG_ON(KM_TYPE_NR > PTRS_PER_PTE); - pkmap_page_table = alloc_kmap_pgtable(PKMAP_BASE); -- -- BUILD_BUG_ON(LAST_PKMAP > PTRS_PER_PTE); -- fixmap_page_table = alloc_kmap_pgtable(FIXMAP_BASE); -+ alloc_kmap_pgtable(FIXMAP_BASE); - } --- -2.30.2 - diff --git a/debian/patches-rt/0033-ARM-highmem-Switch-to-generic-kmap-atomic.patch b/debian/patches-rt/0033-ARM-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index 8c2ef6ade..000000000 --- a/debian/patches-rt/0033-ARM-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,271 +0,0 @@ -From 2702087141e730ec0bd12b16ccf55c9f766d682f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:22 +0100 -Subject: [PATCH 033/296] ARM: highmem: Switch to generic kmap atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No reason having the same code in every architecture. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Russell King <linux@armlinux.org.uk> -Cc: Arnd Bergmann <arnd@arndb.de> -Cc: linux-arm-kernel@lists.infradead.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arm/Kconfig | 1 + - arch/arm/include/asm/fixmap.h | 4 +- - arch/arm/include/asm/highmem.h | 34 ++++++--- - arch/arm/include/asm/kmap_types.h | 10 --- - arch/arm/mm/Makefile | 1 - - arch/arm/mm/highmem.c | 121 ------------------------------ - 6 files changed, 27 insertions(+), 144 deletions(-) - delete mode 100644 arch/arm/include/asm/kmap_types.h - delete mode 100644 arch/arm/mm/highmem.c - -diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig -index 002e0cf025f5..4708ede3b826 100644 ---- a/arch/arm/Kconfig -+++ b/arch/arm/Kconfig -@@ -1499,6 +1499,7 @@ config HAVE_ARCH_PFN_VALID - config HIGHMEM - bool "High Memory Support" - depends on MMU -+ select KMAP_LOCAL - help - The address space of ARM processors is only 4 Gigabytes large - and it has to accommodate user address space, kernel address -diff --git a/arch/arm/include/asm/fixmap.h b/arch/arm/include/asm/fixmap.h -index fc56fc3e1931..c279a8a463a2 100644 ---- a/arch/arm/include/asm/fixmap.h -+++ b/arch/arm/include/asm/fixmap.h -@@ -7,14 +7,14 @@ - #define FIXADDR_TOP (FIXADDR_END - PAGE_SIZE) - - #include <linux/pgtable.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - - enum fixed_addresses { - FIX_EARLYCON_MEM_BASE, - __end_of_permanent_fixed_addresses, - - FIX_KMAP_BEGIN = __end_of_permanent_fixed_addresses, -- FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1, -+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_MAX_IDX * NR_CPUS) - 1, - - /* Support writing RO kernel text via kprobes, jump labels, etc. 
*/ - FIX_TEXT_POKE0, -diff --git a/arch/arm/include/asm/highmem.h b/arch/arm/include/asm/highmem.h -index 31811be38d78..b22dffa8c7eb 100644 ---- a/arch/arm/include/asm/highmem.h -+++ b/arch/arm/include/asm/highmem.h -@@ -2,7 +2,8 @@ - #ifndef _ASM_HIGHMEM_H - #define _ASM_HIGHMEM_H - --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> -+#include <asm/fixmap.h> - - #define PKMAP_BASE (PAGE_OFFSET - PMD_SIZE) - #define LAST_PKMAP PTRS_PER_PTE -@@ -46,19 +47,32 @@ extern pte_t *pkmap_page_table; - - #ifdef ARCH_NEEDS_KMAP_HIGH_GET - extern void *kmap_high_get(struct page *page); --#else -+ -+static inline void *arch_kmap_local_high_get(struct page *page) -+{ -+ if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM) && !cache_is_vivt()) -+ return NULL; -+ return kmap_high_get(page); -+} -+#define arch_kmap_local_high_get arch_kmap_local_high_get -+ -+#else /* ARCH_NEEDS_KMAP_HIGH_GET */ - static inline void *kmap_high_get(struct page *page) - { - return NULL; - } --#endif -+#endif /* !ARCH_NEEDS_KMAP_HIGH_GET */ - --/* -- * The following functions are already defined by <linux/highmem.h> -- * when CONFIG_HIGHMEM is not set. -- */ --#ifdef CONFIG_HIGHMEM --extern void *kmap_atomic_pfn(unsigned long pfn); --#endif -+#define arch_kmap_local_post_map(vaddr, pteval) \ -+ local_flush_tlb_kernel_page(vaddr) -+ -+#define arch_kmap_local_pre_unmap(vaddr) \ -+do { \ -+ if (cache_is_vivt()) \ -+ __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE); \ -+} while (0) -+ -+#define arch_kmap_local_post_unmap(vaddr) \ -+ local_flush_tlb_kernel_page(vaddr) - - #endif -diff --git a/arch/arm/include/asm/kmap_types.h b/arch/arm/include/asm/kmap_types.h -deleted file mode 100644 -index 5590940ee43d..000000000000 ---- a/arch/arm/include/asm/kmap_types.h -+++ /dev/null -@@ -1,10 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef __ARM_KMAP_TYPES_H --#define __ARM_KMAP_TYPES_H -- --/* -- * This is the "bare minimum". AIO seems to require this. 
-- */ --#define KM_TYPE_NR 16 -- --#endif -diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile -index 7cb1699fbfc4..c4ce477c5261 100644 ---- a/arch/arm/mm/Makefile -+++ b/arch/arm/mm/Makefile -@@ -19,7 +19,6 @@ obj-$(CONFIG_MODULES) += proc-syms.o - obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o - - obj-$(CONFIG_ALIGNMENT_TRAP) += alignment.o --obj-$(CONFIG_HIGHMEM) += highmem.o - obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o - obj-$(CONFIG_ARM_PV_FIXUP) += pv-fixup-asm.o - -diff --git a/arch/arm/mm/highmem.c b/arch/arm/mm/highmem.c -deleted file mode 100644 -index 187fab227b50..000000000000 ---- a/arch/arm/mm/highmem.c -+++ /dev/null -@@ -1,121 +0,0 @@ --// SPDX-License-Identifier: GPL-2.0-only --/* -- * arch/arm/mm/highmem.c -- ARM highmem support -- * -- * Author: Nicolas Pitre -- * Created: september 8, 2008 -- * Copyright: Marvell Semiconductors Inc. -- */ -- --#include <linux/module.h> --#include <linux/highmem.h> --#include <linux/interrupt.h> --#include <asm/fixmap.h> --#include <asm/cacheflush.h> --#include <asm/tlbflush.h> --#include "mm.h" -- --static inline void set_fixmap_pte(int idx, pte_t pte) --{ -- unsigned long vaddr = __fix_to_virt(idx); -- pte_t *ptep = virt_to_kpte(vaddr); -- -- set_pte_ext(ptep, pte, 0); -- local_flush_tlb_kernel_page(vaddr); --} -- --static inline pte_t get_fixmap_pte(unsigned long vaddr) --{ -- pte_t *ptep = virt_to_kpte(vaddr); -- -- return *ptep; --} -- --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- unsigned int idx; -- unsigned long vaddr; -- void *kmap; -- int type; -- --#ifdef CONFIG_DEBUG_HIGHMEM -- /* -- * There is no cache coherency issue when non VIVT, so force the -- * dedicated kmap usage for better debugging purposes in that case. 
-- */ -- if (!cache_is_vivt()) -- kmap = NULL; -- else --#endif -- kmap = kmap_high_get(page); -- if (kmap) -- return kmap; -- -- type = kmap_atomic_idx_push(); -- -- idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR * smp_processor_id(); -- vaddr = __fix_to_virt(idx); --#ifdef CONFIG_DEBUG_HIGHMEM -- /* -- * With debugging enabled, kunmap_atomic forces that entry to 0. -- * Make sure it was indeed properly unmapped. -- */ -- BUG_ON(!pte_none(get_fixmap_pte(vaddr))); --#endif -- /* -- * When debugging is off, kunmap_atomic leaves the previous mapping -- * in place, so the contained TLB flush ensures the TLB is updated -- * with the new mapping. -- */ -- set_fixmap_pte(idx, mk_pte(page, prot)); -- -- return (void *)vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kvaddr) --{ -- unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- int idx, type; -- -- if (kvaddr >= (void *)FIXADDR_START) { -- type = kmap_atomic_idx(); -- idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR * smp_processor_id(); -- -- if (cache_is_vivt()) -- __cpuc_flush_dcache_area((void *)vaddr, PAGE_SIZE); --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(vaddr != __fix_to_virt(idx)); -- set_fixmap_pte(idx, __pte(0)); --#else -- (void) idx; /* to kill a warning */ --#endif -- kmap_atomic_idx_pop(); -- } else if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) { -- /* this address was obtained through kmap_high_get() */ -- kunmap_high(pte_page(pkmap_page_table[PKMAP_NR(vaddr)])); -- } --} --EXPORT_SYMBOL(kunmap_atomic_high); -- --void *kmap_atomic_pfn(unsigned long pfn) --{ -- unsigned long vaddr; -- int idx, type; -- struct page *page = pfn_to_page(pfn); -- -- preempt_disable(); -- pagefault_disable(); -- if (!PageHighMem(page)) -- return page_address(page); -- -- type = kmap_atomic_idx_push(); -- idx = FIX_KMAP_BEGIN + type + KM_TYPE_NR * smp_processor_id(); -- vaddr = __fix_to_virt(idx); --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(!pte_none(get_fixmap_pte(vaddr))); --#endif -- 
set_fixmap_pte(idx, pfn_pte(pfn, kmap_prot)); -- -- return (void *)vaddr; --} --- -2.30.2 - diff --git a/debian/patches-rt/0034-csky-mm-highmem-Switch-to-generic-kmap-atomic.patch b/debian/patches-rt/0034-csky-mm-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index d5e53dd05..000000000 --- a/debian/patches-rt/0034-csky-mm-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,179 +0,0 @@ -From 1d66c3f3510f323e10a087befba3587b8eae85fa Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:23 +0100 -Subject: [PATCH 034/296] csky/mm/highmem: Switch to generic kmap atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No reason having the same code in every architecture. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: linux-csky@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/csky/Kconfig | 1 + - arch/csky/include/asm/fixmap.h | 4 +- - arch/csky/include/asm/highmem.h | 6 ++- - arch/csky/mm/highmem.c | 75 +-------------------------------- - 4 files changed, 8 insertions(+), 78 deletions(-) - -diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig -index 7bf0a617e94c..c9f2533cc53d 100644 ---- a/arch/csky/Kconfig -+++ b/arch/csky/Kconfig -@@ -286,6 +286,7 @@ config NR_CPUS - config HIGHMEM - bool "High Memory Support" - depends on !CPU_CK610 -+ select KMAP_LOCAL - default y - - config FORCE_MAX_ZONEORDER -diff --git a/arch/csky/include/asm/fixmap.h b/arch/csky/include/asm/fixmap.h -index 81f9477d5330..4b589cc20900 100644 ---- a/arch/csky/include/asm/fixmap.h -+++ b/arch/csky/include/asm/fixmap.h -@@ -8,7 +8,7 @@ - #include <asm/memory.h> - #ifdef CONFIG_HIGHMEM - #include <linux/threads.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #endif - - enum fixed_addresses { -@@ -17,7 +17,7 @@ enum fixed_addresses { - #endif - #ifdef CONFIG_HIGHMEM - FIX_KMAP_BEGIN, -- FIX_KMAP_END = 
FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1, -+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_MAX_IDX * NR_CPUS) - 1, - #endif - __end_of_fixed_addresses - }; -diff --git a/arch/csky/include/asm/highmem.h b/arch/csky/include/asm/highmem.h -index 14645e3d5cd5..1f4ed3f4c0d9 100644 ---- a/arch/csky/include/asm/highmem.h -+++ b/arch/csky/include/asm/highmem.h -@@ -9,7 +9,7 @@ - #include <linux/init.h> - #include <linux/interrupt.h> - #include <linux/uaccess.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #include <asm/cache.h> - - /* undef for production */ -@@ -32,10 +32,12 @@ extern pte_t *pkmap_page_table; - - #define ARCH_HAS_KMAP_FLUSH_TLB - extern void kmap_flush_tlb(unsigned long addr); --extern void *kmap_atomic_pfn(unsigned long pfn); - - #define flush_cache_kmaps() do {} while (0) - -+#define arch_kmap_local_post_map(vaddr, pteval) kmap_flush_tlb(vaddr) -+#define arch_kmap_local_post_unmap(vaddr) kmap_flush_tlb(vaddr) -+ - extern void kmap_init(void); - - #endif /* __KERNEL__ */ -diff --git a/arch/csky/mm/highmem.c b/arch/csky/mm/highmem.c -index 89c10800a002..4161df3c6c15 100644 ---- a/arch/csky/mm/highmem.c -+++ b/arch/csky/mm/highmem.c -@@ -9,8 +9,6 @@ - #include <asm/tlbflush.h> - #include <asm/cacheflush.h> - --static pte_t *kmap_pte; -- - unsigned long highstart_pfn, highend_pfn; - - void kmap_flush_tlb(unsigned long addr) -@@ -19,67 +17,7 @@ void kmap_flush_tlb(unsigned long addr) - } - EXPORT_SYMBOL(kmap_flush_tlb); - --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- unsigned long vaddr; -- int idx, type; -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(!pte_none(*(kmap_pte - idx))); --#endif -- set_pte(kmap_pte-idx, mk_pte(page, prot)); -- flush_tlb_one((unsigned long)vaddr); -- -- return (void *)vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kvaddr) --{ -- 
unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- int idx; -- -- if (vaddr < FIXADDR_START) -- return; -- --#ifdef CONFIG_DEBUG_HIGHMEM -- idx = KM_TYPE_NR*smp_processor_id() + kmap_atomic_idx(); -- -- BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx)); -- -- pte_clear(&init_mm, vaddr, kmap_pte - idx); -- flush_tlb_one(vaddr); --#else -- (void) idx; /* to kill a warning */ --#endif -- kmap_atomic_idx_pop(); --} --EXPORT_SYMBOL(kunmap_atomic_high); -- --/* -- * This is the same as kmap_atomic() but can map memory that doesn't -- * have a struct page associated with it. -- */ --void *kmap_atomic_pfn(unsigned long pfn) --{ -- unsigned long vaddr; -- int idx, type; -- -- pagefault_disable(); -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -- set_pte(kmap_pte-idx, pfn_pte(pfn, PAGE_KERNEL)); -- flush_tlb_one(vaddr); -- -- return (void *) vaddr; --} -- --static void __init kmap_pages_init(void) -+void __init kmap_init(void) - { - unsigned long vaddr; - pgd_t *pgd; -@@ -96,14 +34,3 @@ static void __init kmap_pages_init(void) - pte = pte_offset_kernel(pmd, vaddr); - pkmap_page_table = pte; - } -- --void __init kmap_init(void) --{ -- unsigned long vaddr; -- -- kmap_pages_init(); -- -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN); -- -- kmap_pte = pte_offset_kernel((pmd_t *)pgd_offset_k(vaddr), vaddr); --} --- -2.30.2 - diff --git a/debian/patches-rt/0035-microblaze-mm-highmem-Switch-to-generic-kmap-atomic.patch b/debian/patches-rt/0035-microblaze-mm-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index 4fcb9949d..000000000 --- a/debian/patches-rt/0035-microblaze-mm-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,197 +0,0 @@ -From 7085d15edefaf8156d1b6ebec4fd36d4b8bdc8ec Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:24 +0100 -Subject: [PATCH 035/296] microblaze/mm/highmem: Switch to generic kmap 
atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No reason having the same code in every architecture. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Michal Simek <monstr@monstr.eu> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/microblaze/Kconfig | 1 + - arch/microblaze/include/asm/fixmap.h | 4 +- - arch/microblaze/include/asm/highmem.h | 6 ++- - arch/microblaze/mm/Makefile | 1 - - arch/microblaze/mm/highmem.c | 78 --------------------------- - arch/microblaze/mm/init.c | 6 --- - 6 files changed, 8 insertions(+), 88 deletions(-) - delete mode 100644 arch/microblaze/mm/highmem.c - -diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig -index 33925ffed68f..7f6ca0ab4f81 100644 ---- a/arch/microblaze/Kconfig -+++ b/arch/microblaze/Kconfig -@@ -155,6 +155,7 @@ config XILINX_UNCACHED_SHADOW - config HIGHMEM - bool "High memory support" - depends on MMU -+ select KMAP_LOCAL - help - The address space of Microblaze processors is only 4 Gigabytes large - and it has to accommodate user address space, kernel address -diff --git a/arch/microblaze/include/asm/fixmap.h b/arch/microblaze/include/asm/fixmap.h -index 0379ce5229e3..e6e9288bff76 100644 ---- a/arch/microblaze/include/asm/fixmap.h -+++ b/arch/microblaze/include/asm/fixmap.h -@@ -20,7 +20,7 @@ - #include <asm/page.h> - #ifdef CONFIG_HIGHMEM - #include <linux/threads.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #endif - - #define FIXADDR_TOP ((unsigned long)(-PAGE_SIZE)) -@@ -47,7 +47,7 @@ enum fixed_addresses { - FIX_HOLE, - #ifdef CONFIG_HIGHMEM - FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ -- FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * num_possible_cpus()) - 1, -+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_MAX_IDX * num_possible_cpus()) - 1, - #endif - __end_of_fixed_addresses - }; -diff --git a/arch/microblaze/include/asm/highmem.h 
b/arch/microblaze/include/asm/highmem.h -index 284ca8fb54c1..4418633fb163 100644 ---- a/arch/microblaze/include/asm/highmem.h -+++ b/arch/microblaze/include/asm/highmem.h -@@ -25,7 +25,6 @@ - #include <linux/uaccess.h> - #include <asm/fixmap.h> - --extern pte_t *kmap_pte; - extern pte_t *pkmap_page_table; - - /* -@@ -52,6 +51,11 @@ extern pte_t *pkmap_page_table; - - #define flush_cache_kmaps() { flush_icache(); flush_dcache(); } - -+#define arch_kmap_local_post_map(vaddr, pteval) \ -+ local_flush_tlb_page(NULL, vaddr); -+#define arch_kmap_local_post_unmap(vaddr) \ -+ local_flush_tlb_page(NULL, vaddr); -+ - #endif /* __KERNEL__ */ - - #endif /* _ASM_HIGHMEM_H */ -diff --git a/arch/microblaze/mm/Makefile b/arch/microblaze/mm/Makefile -index 1b16875cea70..8ced71100047 100644 ---- a/arch/microblaze/mm/Makefile -+++ b/arch/microblaze/mm/Makefile -@@ -6,4 +6,3 @@ - obj-y := consistent.o init.o - - obj-$(CONFIG_MMU) += pgtable.o mmu_context.o fault.o --obj-$(CONFIG_HIGHMEM) += highmem.o -diff --git a/arch/microblaze/mm/highmem.c b/arch/microblaze/mm/highmem.c -deleted file mode 100644 -index 92e0890416c9..000000000000 ---- a/arch/microblaze/mm/highmem.c -+++ /dev/null -@@ -1,78 +0,0 @@ --// SPDX-License-Identifier: GPL-2.0 --/* -- * highmem.c: virtual kernel memory mappings for high memory -- * -- * PowerPC version, stolen from the i386 version. -- * -- * Used in CONFIG_HIGHMEM systems for memory pages which -- * are not addressable by direct kernel virtual addresses. -- * -- * Copyright (C) 1999 Gerhard Wichert, Siemens AG -- * Gerhard.Wichert@pdb.siemens.de -- * -- * -- * Redesigned the x86 32-bit VM architecture to deal with -- * up to 16 Terrabyte physical memory. With current x86 CPUs -- * we now support up to 64 Gigabytes physical RAM. -- * -- * Copyright (C) 1999 Ingo Molnar <mingo@redhat.com> -- * -- * Reworked for PowerPC by various contributors. Moved from -- * highmem.h by Benjamin Herrenschmidt (c) 2009 IBM Corp. 
-- */ -- --#include <linux/export.h> --#include <linux/highmem.h> -- --/* -- * The use of kmap_atomic/kunmap_atomic is discouraged - kmap/kunmap -- * gives a more generic (and caching) interface. But kmap_atomic can -- * be used in IRQ contexts, so in some (very limited) cases we need -- * it. -- */ --#include <asm/tlbflush.h> -- --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- -- unsigned long vaddr; -- int idx, type; -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(!pte_none(*(kmap_pte-idx))); --#endif -- set_pte_at(&init_mm, vaddr, kmap_pte-idx, mk_pte(page, prot)); -- local_flush_tlb_page(NULL, vaddr); -- -- return (void *) vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kvaddr) --{ -- unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- int type; -- unsigned int idx; -- -- if (vaddr < __fix_to_virt(FIX_KMAP_END)) -- return; -- -- type = kmap_atomic_idx(); -- -- idx = type + KM_TYPE_NR * smp_processor_id(); --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx)); --#endif -- /* -- * force other mappings to Oops if they'll try to access -- * this pte without first remap it -- */ -- pte_clear(&init_mm, vaddr, kmap_pte-idx); -- local_flush_tlb_page(NULL, vaddr); -- -- kmap_atomic_idx_pop(); --} --EXPORT_SYMBOL(kunmap_atomic_high); -diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c -index 45da639bd22c..1f4b5b34e600 100644 ---- a/arch/microblaze/mm/init.c -+++ b/arch/microblaze/mm/init.c -@@ -49,17 +49,11 @@ unsigned long lowmem_size; - EXPORT_SYMBOL(min_low_pfn); - EXPORT_SYMBOL(max_low_pfn); - --#ifdef CONFIG_HIGHMEM --pte_t *kmap_pte; --EXPORT_SYMBOL(kmap_pte); -- - static void __init highmem_init(void) - { - pr_debug("%x\n", (u32)PKMAP_BASE); - map_page(PKMAP_BASE, 0, 0); /* XXX gross */ - pkmap_page_table = 
virt_to_kpte(PKMAP_BASE); -- -- kmap_pte = virt_to_kpte(__fix_to_virt(FIX_KMAP_BEGIN)); - } - - static void highmem_setup(void) --- -2.30.2 - diff --git a/debian/patches-rt/0036-mips-mm-highmem-Switch-to-generic-kmap-atomic.patch b/debian/patches-rt/0036-mips-mm-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index e006f57c0..000000000 --- a/debian/patches-rt/0036-mips-mm-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,219 +0,0 @@ -From 154f897521e8207ffd3122e6816af46be4ddbb81 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:25 +0100 -Subject: [PATCH 036/296] mips/mm/highmem: Switch to generic kmap atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No reason having the same code in every architecture - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> -Cc: linux-mips@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/mips/Kconfig | 1 + - arch/mips/include/asm/fixmap.h | 4 +- - arch/mips/include/asm/highmem.h | 6 +-- - arch/mips/include/asm/kmap_types.h | 13 ----- - arch/mips/mm/highmem.c | 77 ------------------------------ - arch/mips/mm/init.c | 4 -- - 6 files changed, 6 insertions(+), 99 deletions(-) - delete mode 100644 arch/mips/include/asm/kmap_types.h - -diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig -index 2000bb2b0220..6b762bebff33 100644 ---- a/arch/mips/Kconfig -+++ b/arch/mips/Kconfig -@@ -2719,6 +2719,7 @@ config WAR_MIPS34K_MISSED_ITLB - config HIGHMEM - bool "High Memory Support" - depends on 32BIT && CPU_SUPPORTS_HIGHMEM && SYS_SUPPORTS_HIGHMEM && !CPU_MIPS32_3_5_EVA -+ select KMAP_LOCAL - - config CPU_SUPPORTS_HIGHMEM - bool -diff --git a/arch/mips/include/asm/fixmap.h b/arch/mips/include/asm/fixmap.h -index 743535be7528..beea14761cef 100644 ---- a/arch/mips/include/asm/fixmap.h -+++ 
b/arch/mips/include/asm/fixmap.h -@@ -17,7 +17,7 @@ - #include <spaces.h> - #ifdef CONFIG_HIGHMEM - #include <linux/threads.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #endif - - /* -@@ -52,7 +52,7 @@ enum fixed_addresses { - #ifdef CONFIG_HIGHMEM - /* reserved pte's for temporary kernel mappings */ - FIX_KMAP_BEGIN = FIX_CMAP_END + 1, -- FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, -+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_MAX_IDX * NR_CPUS) - 1, - #endif - __end_of_fixed_addresses - }; -diff --git a/arch/mips/include/asm/highmem.h b/arch/mips/include/asm/highmem.h -index f1f788b57166..19edf8e69971 100644 ---- a/arch/mips/include/asm/highmem.h -+++ b/arch/mips/include/asm/highmem.h -@@ -24,7 +24,7 @@ - #include <linux/interrupt.h> - #include <linux/uaccess.h> - #include <asm/cpu-features.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - - /* declarations for highmem.c */ - extern unsigned long highstart_pfn, highend_pfn; -@@ -48,11 +48,11 @@ extern pte_t *pkmap_page_table; - - #define ARCH_HAS_KMAP_FLUSH_TLB - extern void kmap_flush_tlb(unsigned long addr); --extern void *kmap_atomic_pfn(unsigned long pfn); - - #define flush_cache_kmaps() BUG_ON(cpu_has_dc_aliases) - --extern void kmap_init(void); -+#define arch_kmap_local_post_map(vaddr, pteval) local_flush_tlb_one(vaddr) -+#define arch_kmap_local_post_unmap(vaddr) local_flush_tlb_one(vaddr) - - #endif /* __KERNEL__ */ - -diff --git a/arch/mips/include/asm/kmap_types.h b/arch/mips/include/asm/kmap_types.h -deleted file mode 100644 -index 16665dc2431b..000000000000 ---- a/arch/mips/include/asm/kmap_types.h -+++ /dev/null -@@ -1,13 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef _ASM_KMAP_TYPES_H --#define _ASM_KMAP_TYPES_H -- --#ifdef CONFIG_DEBUG_HIGHMEM --#define __WITH_KM_FENCE --#endif -- --#include <asm-generic/kmap_types.h> -- --#undef __WITH_KM_FENCE -- --#endif -diff --git a/arch/mips/mm/highmem.c b/arch/mips/mm/highmem.c -index 5fec7f45d79a..57e2f08f00d0 
100644 ---- a/arch/mips/mm/highmem.c -+++ b/arch/mips/mm/highmem.c -@@ -8,8 +8,6 @@ - #include <asm/fixmap.h> - #include <asm/tlbflush.h> - --static pte_t *kmap_pte; -- - unsigned long highstart_pfn, highend_pfn; - - void kmap_flush_tlb(unsigned long addr) -@@ -17,78 +15,3 @@ void kmap_flush_tlb(unsigned long addr) - flush_tlb_one(addr); - } - EXPORT_SYMBOL(kmap_flush_tlb); -- --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- unsigned long vaddr; -- int idx, type; -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(!pte_none(*(kmap_pte - idx))); --#endif -- set_pte(kmap_pte-idx, mk_pte(page, prot)); -- local_flush_tlb_one((unsigned long)vaddr); -- -- return (void*) vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kvaddr) --{ -- unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- int type __maybe_unused; -- -- if (vaddr < FIXADDR_START) -- return; -- -- type = kmap_atomic_idx(); --#ifdef CONFIG_DEBUG_HIGHMEM -- { -- int idx = type + KM_TYPE_NR * smp_processor_id(); -- -- BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx)); -- -- /* -- * force other mappings to Oops if they'll try to access -- * this pte without first remap it -- */ -- pte_clear(&init_mm, vaddr, kmap_pte-idx); -- local_flush_tlb_one(vaddr); -- } --#endif -- kmap_atomic_idx_pop(); --} --EXPORT_SYMBOL(kunmap_atomic_high); -- --/* -- * This is the same as kmap_atomic() but can map memory that doesn't -- * have a struct page associated with it. 
-- */ --void *kmap_atomic_pfn(unsigned long pfn) --{ -- unsigned long vaddr; -- int idx, type; -- -- preempt_disable(); -- pagefault_disable(); -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -- set_pte(kmap_pte-idx, pfn_pte(pfn, PAGE_KERNEL)); -- flush_tlb_one(vaddr); -- -- return (void*) vaddr; --} -- --void __init kmap_init(void) --{ -- unsigned long kmap_vstart; -- -- /* cache the first kmap pte */ -- kmap_vstart = __fix_to_virt(FIX_KMAP_BEGIN); -- kmap_pte = virt_to_kpte(kmap_vstart); --} -diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c -index 07e84a774938..bc80893e5c0f 100644 ---- a/arch/mips/mm/init.c -+++ b/arch/mips/mm/init.c -@@ -36,7 +36,6 @@ - #include <asm/cachectl.h> - #include <asm/cpu.h> - #include <asm/dma.h> --#include <asm/kmap_types.h> - #include <asm/maar.h> - #include <asm/mmu_context.h> - #include <asm/sections.h> -@@ -402,9 +401,6 @@ void __init paging_init(void) - - pagetable_init(); - --#ifdef CONFIG_HIGHMEM -- kmap_init(); --#endif - #ifdef CONFIG_ZONE_DMA - max_zone_pfns[ZONE_DMA] = MAX_DMA_PFN; - #endif --- -2.30.2 - diff --git a/debian/patches-rt/0037-nds32-mm-highmem-Switch-to-generic-kmap-atomic.patch b/debian/patches-rt/0037-nds32-mm-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index 68672e6c4..000000000 --- a/debian/patches-rt/0037-nds32-mm-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,167 +0,0 @@ -From f3b1266a1298b76eae3cebc2342107185dfb0976 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:26 +0100 -Subject: [PATCH 037/296] nds32/mm/highmem: Switch to generic kmap atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The mapping code is odd and looks broken. See FIXME in the comment. - -Also fix the harmless off by one in the FIX_KMAP_END define. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Nick Hu <nickhu@andestech.com> -Cc: Greentime Hu <green.hu@gmail.com> -Cc: Vincent Chen <deanbo422@gmail.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/nds32/Kconfig.cpu | 1 + - arch/nds32/include/asm/fixmap.h | 4 +-- - arch/nds32/include/asm/highmem.h | 22 +++++++++++---- - arch/nds32/mm/Makefile | 1 - - arch/nds32/mm/highmem.c | 48 -------------------------------- - 5 files changed, 19 insertions(+), 57 deletions(-) - delete mode 100644 arch/nds32/mm/highmem.c - -diff --git a/arch/nds32/Kconfig.cpu b/arch/nds32/Kconfig.cpu -index f88a12fdf0f3..c10759952485 100644 ---- a/arch/nds32/Kconfig.cpu -+++ b/arch/nds32/Kconfig.cpu -@@ -157,6 +157,7 @@ config HW_SUPPORT_UNALIGNMENT_ACCESS - config HIGHMEM - bool "High Memory Support" - depends on MMU && !CPU_CACHE_ALIASING -+ select KMAP_LOCAL - help - The address space of Andes processors is only 4 Gigabytes large - and it has to accommodate user address space, kernel address -diff --git a/arch/nds32/include/asm/fixmap.h b/arch/nds32/include/asm/fixmap.h -index 5a4bf11e5800..2fa09a2de428 100644 ---- a/arch/nds32/include/asm/fixmap.h -+++ b/arch/nds32/include/asm/fixmap.h -@@ -6,7 +6,7 @@ - - #ifdef CONFIG_HIGHMEM - #include <linux/threads.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #endif - - enum fixed_addresses { -@@ -14,7 +14,7 @@ enum fixed_addresses { - FIX_KMAP_RESERVED, - FIX_KMAP_BEGIN, - #ifdef CONFIG_HIGHMEM -- FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS), -+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_MAX_IDX * NR_CPUS) - 1, - #endif - FIX_EARLYCON_MEM_BASE, - __end_of_fixed_addresses -diff --git a/arch/nds32/include/asm/highmem.h b/arch/nds32/include/asm/highmem.h -index fe986d0e6e3f..16159a8716f2 100644 ---- a/arch/nds32/include/asm/highmem.h -+++ b/arch/nds32/include/asm/highmem.h -@@ -5,7 +5,6 @@ - #define _ASM_HIGHMEM_H - - #include <asm/proc-fns.h> --#include <asm/kmap_types.h> - #include 
<asm/fixmap.h> - - /* -@@ -45,11 +44,22 @@ extern pte_t *pkmap_page_table; - extern void kmap_init(void); - - /* -- * The following functions are already defined by <linux/highmem.h> -- * when CONFIG_HIGHMEM is not set. -+ * FIXME: The below looks broken vs. a kmap_atomic() in task context which -+ * is interupted and another kmap_atomic() happens in interrupt context. -+ * But what do I know about nds32. -- tglx - */ --#ifdef CONFIG_HIGHMEM --extern void *kmap_atomic_pfn(unsigned long pfn); --#endif -+#define arch_kmap_local_post_map(vaddr, pteval) \ -+ do { \ -+ __nds32__tlbop_inv(vaddr); \ -+ __nds32__mtsr_dsb(vaddr, NDS32_SR_TLB_VPN); \ -+ __nds32__tlbop_rwr(pteval); \ -+ __nds32__isb(); \ -+ } while (0) -+ -+#define arch_kmap_local_pre_unmap(vaddr) \ -+ do { \ -+ __nds32__tlbop_inv(vaddr); \ -+ __nds32__isb(); \ -+ } while (0) - - #endif -diff --git a/arch/nds32/mm/Makefile b/arch/nds32/mm/Makefile -index 897ecaf5cf54..14fb2e8eb036 100644 ---- a/arch/nds32/mm/Makefile -+++ b/arch/nds32/mm/Makefile -@@ -3,7 +3,6 @@ obj-y := extable.o tlb.o fault.o init.o mmap.o \ - mm-nds32.o cacheflush.o proc.o - - obj-$(CONFIG_ALIGNMENT_TRAP) += alignment.o --obj-$(CONFIG_HIGHMEM) += highmem.o - - ifdef CONFIG_FUNCTION_TRACER - CFLAGS_REMOVE_proc.o = $(CC_FLAGS_FTRACE) -diff --git a/arch/nds32/mm/highmem.c b/arch/nds32/mm/highmem.c -deleted file mode 100644 -index 4284cd59e21a..000000000000 ---- a/arch/nds32/mm/highmem.c -+++ /dev/null -@@ -1,48 +0,0 @@ --// SPDX-License-Identifier: GPL-2.0 --// Copyright (C) 2005-2017 Andes Technology Corporation -- --#include <linux/export.h> --#include <linux/highmem.h> --#include <linux/sched.h> --#include <linux/smp.h> --#include <linux/interrupt.h> --#include <linux/memblock.h> --#include <asm/fixmap.h> --#include <asm/tlbflush.h> -- --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- unsigned int idx; -- unsigned long vaddr, pte; -- int type; -- pte_t *ptep; -- -- type = kmap_atomic_idx_push(); -- -- idx = type + 
KM_TYPE_NR * smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -- pte = (page_to_pfn(page) << PAGE_SHIFT) | prot; -- ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr); -- set_pte(ptep, pte); -- -- __nds32__tlbop_inv(vaddr); -- __nds32__mtsr_dsb(vaddr, NDS32_SR_TLB_VPN); -- __nds32__tlbop_rwr(pte); -- __nds32__isb(); -- return (void *)vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kvaddr) --{ -- if (kvaddr >= (void *)FIXADDR_START) { -- unsigned long vaddr = (unsigned long)kvaddr; -- pte_t *ptep; -- kmap_atomic_idx_pop(); -- __nds32__tlbop_inv(vaddr); -- __nds32__isb(); -- ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr); -- set_pte(ptep, 0); -- } --} --EXPORT_SYMBOL(kunmap_atomic_high); --- -2.30.2 - diff --git a/debian/patches-rt/0038-powerpc-mm-highmem-Switch-to-generic-kmap-atomic.patch b/debian/patches-rt/0038-powerpc-mm-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index 407b9618a..000000000 --- a/debian/patches-rt/0038-powerpc-mm-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,202 +0,0 @@ -From c9127f8713ccd642ff949ef684c3278ceac39f8e Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:27 +0100 -Subject: [PATCH 038/296] powerpc/mm/highmem: Switch to generic kmap atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No reason having the same code in every architecture - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Michael Ellerman <mpe@ellerman.id.au> -Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> -Cc: Paul Mackerras <paulus@samba.org> -Cc: linuxppc-dev@lists.ozlabs.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/powerpc/Kconfig | 1 + - arch/powerpc/include/asm/fixmap.h | 4 +- - arch/powerpc/include/asm/highmem.h | 7 ++- - arch/powerpc/include/asm/kmap_types.h | 13 ------ - arch/powerpc/mm/Makefile | 1 - - 
arch/powerpc/mm/highmem.c | 67 --------------------------- - arch/powerpc/mm/mem.c | 7 --- - 7 files changed, 8 insertions(+), 92 deletions(-) - delete mode 100644 arch/powerpc/include/asm/kmap_types.h - delete mode 100644 arch/powerpc/mm/highmem.c - ---- a/arch/powerpc/Kconfig -+++ b/arch/powerpc/Kconfig -@@ -410,6 +410,7 @@ - config HIGHMEM - bool "High memory support" - depends on PPC32 -+ select KMAP_LOCAL - - source "kernel/Kconfig.hz" - ---- a/arch/powerpc/include/asm/fixmap.h -+++ b/arch/powerpc/include/asm/fixmap.h -@@ -20,7 +20,7 @@ - #include <asm/page.h> - #ifdef CONFIG_HIGHMEM - #include <linux/threads.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #endif - - #ifdef CONFIG_PPC64 -@@ -61,7 +61,7 @@ - FIX_EARLY_DEBUG_BASE = FIX_EARLY_DEBUG_TOP+(ALIGN(SZ_128K, PAGE_SIZE)/PAGE_SIZE)-1, - #ifdef CONFIG_HIGHMEM - FIX_KMAP_BEGIN, /* reserved pte's for temporary kernel mappings */ -- FIX_KMAP_END = FIX_KMAP_BEGIN+(KM_TYPE_NR*NR_CPUS)-1, -+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_MAX_IDX * NR_CPUS) - 1, - #endif - #ifdef CONFIG_PPC_8xx - /* For IMMR we need an aligned 512K area */ ---- a/arch/powerpc/include/asm/highmem.h -+++ b/arch/powerpc/include/asm/highmem.h -@@ -24,12 +24,10 @@ - #ifdef __KERNEL__ - - #include <linux/interrupt.h> --#include <asm/kmap_types.h> - #include <asm/cacheflush.h> - #include <asm/page.h> - #include <asm/fixmap.h> - --extern pte_t *kmap_pte; - extern pte_t *pkmap_page_table; - - /* -@@ -60,6 +58,11 @@ - - #define flush_cache_kmaps() flush_cache_all() - -+#define arch_kmap_local_post_map(vaddr, pteval) \ -+ local_flush_tlb_page(NULL, vaddr) -+#define arch_kmap_local_post_unmap(vaddr) \ -+ local_flush_tlb_page(NULL, vaddr) -+ - #endif /* __KERNEL__ */ - - #endif /* _ASM_HIGHMEM_H */ ---- a/arch/powerpc/include/asm/kmap_types.h -+++ /dev/null -@@ -1,13 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0-or-later */ --#ifndef _ASM_POWERPC_KMAP_TYPES_H --#define _ASM_POWERPC_KMAP_TYPES_H -- --#ifdef __KERNEL__ -- --/* -- */ -- 
--#define KM_TYPE_NR 16 -- --#endif /* __KERNEL__ */ --#endif /* _ASM_POWERPC_KMAP_TYPES_H */ ---- a/arch/powerpc/mm/Makefile -+++ b/arch/powerpc/mm/Makefile -@@ -16,7 +16,6 @@ - obj-$(CONFIG_PPC_MM_SLICES) += slice.o - obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o - obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o --obj-$(CONFIG_HIGHMEM) += highmem.o - obj-$(CONFIG_PPC_COPRO_BASE) += copro_fault.o - obj-$(CONFIG_PPC_PTDUMP) += ptdump/ - obj-$(CONFIG_KASAN) += kasan/ ---- a/arch/powerpc/mm/highmem.c -+++ /dev/null -@@ -1,67 +0,0 @@ --// SPDX-License-Identifier: GPL-2.0 --/* -- * highmem.c: virtual kernel memory mappings for high memory -- * -- * PowerPC version, stolen from the i386 version. -- * -- * Used in CONFIG_HIGHMEM systems for memory pages which -- * are not addressable by direct kernel virtual addresses. -- * -- * Copyright (C) 1999 Gerhard Wichert, Siemens AG -- * Gerhard.Wichert@pdb.siemens.de -- * -- * -- * Redesigned the x86 32-bit VM architecture to deal with -- * up to 16 Terrabyte physical memory. With current x86 CPUs -- * we now support up to 64 Gigabytes physical RAM. -- * -- * Copyright (C) 1999 Ingo Molnar <mingo@redhat.com> -- * -- * Reworked for PowerPC by various contributors. Moved from -- * highmem.h by Benjamin Herrenschmidt (c) 2009 IBM Corp. 
-- */ -- --#include <linux/highmem.h> --#include <linux/module.h> -- --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- unsigned long vaddr; -- int idx, type; -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -- WARN_ON(IS_ENABLED(CONFIG_DEBUG_HIGHMEM) && !pte_none(*(kmap_pte - idx))); -- __set_pte_at(&init_mm, vaddr, kmap_pte-idx, mk_pte(page, prot), 1); -- local_flush_tlb_page(NULL, vaddr); -- -- return (void*) vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kvaddr) --{ -- unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- -- if (vaddr < __fix_to_virt(FIX_KMAP_END)) -- return; -- -- if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM)) { -- int type = kmap_atomic_idx(); -- unsigned int idx; -- -- idx = type + KM_TYPE_NR * smp_processor_id(); -- WARN_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN + idx)); -- -- /* -- * force other mappings to Oops if they'll try to access -- * this pte without first remap it -- */ -- pte_clear(&init_mm, vaddr, kmap_pte-idx); -- local_flush_tlb_page(NULL, vaddr); -- } -- -- kmap_atomic_idx_pop(); --} --EXPORT_SYMBOL(kunmap_atomic_high); ---- a/arch/powerpc/mm/mem.c -+++ b/arch/powerpc/mm/mem.c -@@ -62,11 +62,6 @@ - unsigned long long memory_limit; - bool init_mem_is_free; - --#ifdef CONFIG_HIGHMEM --pte_t *kmap_pte; --EXPORT_SYMBOL(kmap_pte); --#endif -- - pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn, - unsigned long size, pgprot_t vma_prot) - { -@@ -236,8 +231,6 @@ - - map_kernel_page(PKMAP_BASE, 0, __pgprot(0)); /* XXX gross */ - pkmap_page_table = virt_to_kpte(PKMAP_BASE); -- -- kmap_pte = virt_to_kpte(__fix_to_virt(FIX_KMAP_BEGIN)); - #endif /* CONFIG_HIGHMEM */ - - printk(KERN_DEBUG "Top of RAM: 0x%llx, Total RAM: 0x%llx\n", diff --git a/debian/patches-rt/0039-sparc-mm-highmem-Switch-to-generic-kmap-atomic.patch 
b/debian/patches-rt/0039-sparc-mm-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index faf2a29cd..000000000 --- a/debian/patches-rt/0039-sparc-mm-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,254 +0,0 @@ -From b16f456359f189f1a356ddc983f4553cd235be0f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:28 +0100 -Subject: [PATCH 039/296] sparc/mm/highmem: Switch to generic kmap atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No reason having the same code in every architecture - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: "David S. Miller" <davem@davemloft.net> -Cc: sparclinux@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/sparc/Kconfig | 1 + - arch/sparc/include/asm/highmem.h | 8 +- - arch/sparc/include/asm/kmap_types.h | 11 --- - arch/sparc/include/asm/vaddrs.h | 4 +- - arch/sparc/mm/Makefile | 3 - - arch/sparc/mm/highmem.c | 115 ---------------------------- - arch/sparc/mm/srmmu.c | 2 - - 7 files changed, 8 insertions(+), 136 deletions(-) - delete mode 100644 arch/sparc/include/asm/kmap_types.h - delete mode 100644 arch/sparc/mm/highmem.c - -diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig -index 530b7ec5d3ca..a38d00d8b783 100644 ---- a/arch/sparc/Kconfig -+++ b/arch/sparc/Kconfig -@@ -139,6 +139,7 @@ config MMU - config HIGHMEM - bool - default y if SPARC32 -+ select KMAP_LOCAL - - config ZONE_DMA - bool -diff --git a/arch/sparc/include/asm/highmem.h b/arch/sparc/include/asm/highmem.h -index 6c35f0d27ee1..875116209ec1 100644 ---- a/arch/sparc/include/asm/highmem.h -+++ b/arch/sparc/include/asm/highmem.h -@@ -24,7 +24,6 @@ - #include <linux/interrupt.h> - #include <linux/pgtable.h> - #include <asm/vaddrs.h> --#include <asm/kmap_types.h> - #include <asm/pgtsrmmu.h> - - /* declarations for highmem.c */ -@@ -33,8 +32,6 @@ extern unsigned long highstart_pfn, 
highend_pfn; - #define kmap_prot __pgprot(SRMMU_ET_PTE | SRMMU_PRIV | SRMMU_CACHE) - extern pte_t *pkmap_page_table; - --void kmap_init(void) __init; -- - /* - * Right now we initialize only a single pte table. It can be extended - * easily, subsequent pte tables have to be allocated in one physical -@@ -53,6 +50,11 @@ void kmap_init(void) __init; - - #define flush_cache_kmaps() flush_cache_all() - -+/* FIXME: Use __flush_tlb_one(vaddr) instead of flush_cache_all() -- Anton */ -+#define arch_kmap_local_post_map(vaddr, pteval) flush_cache_all() -+#define arch_kmap_local_post_unmap(vaddr) flush_cache_all() -+ -+ - #endif /* __KERNEL__ */ - - #endif /* _ASM_HIGHMEM_H */ -diff --git a/arch/sparc/include/asm/kmap_types.h b/arch/sparc/include/asm/kmap_types.h -deleted file mode 100644 -index 55a99b6bd91e..000000000000 ---- a/arch/sparc/include/asm/kmap_types.h -+++ /dev/null -@@ -1,11 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef _ASM_KMAP_TYPES_H --#define _ASM_KMAP_TYPES_H -- --/* Dummy header just to define km_type. None of this -- * is actually used on sparc. 
-DaveM -- */ -- --#include <asm-generic/kmap_types.h> -- --#endif -diff --git a/arch/sparc/include/asm/vaddrs.h b/arch/sparc/include/asm/vaddrs.h -index 84d054b07a6f..4fec0341e2a8 100644 ---- a/arch/sparc/include/asm/vaddrs.h -+++ b/arch/sparc/include/asm/vaddrs.h -@@ -32,13 +32,13 @@ - #define SRMMU_NOCACHE_ALCRATIO 64 /* 256 pages per 64MB of system RAM */ - - #ifndef __ASSEMBLY__ --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - - enum fixed_addresses { - FIX_HOLE, - #ifdef CONFIG_HIGHMEM - FIX_KMAP_BEGIN, -- FIX_KMAP_END = (KM_TYPE_NR * NR_CPUS), -+ FIX_KMAP_END = (KM_MAX_IDX * NR_CPUS), - #endif - __end_of_fixed_addresses - }; -diff --git a/arch/sparc/mm/Makefile b/arch/sparc/mm/Makefile -index b078205b70e0..68db1f859b02 100644 ---- a/arch/sparc/mm/Makefile -+++ b/arch/sparc/mm/Makefile -@@ -15,6 +15,3 @@ obj-$(CONFIG_SPARC32) += leon_mm.o - - # Only used by sparc64 - obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o -- --# Only used by sparc32 --obj-$(CONFIG_HIGHMEM) += highmem.o -diff --git a/arch/sparc/mm/highmem.c b/arch/sparc/mm/highmem.c -deleted file mode 100644 -index 8f2a2afb048a..000000000000 ---- a/arch/sparc/mm/highmem.c -+++ /dev/null -@@ -1,115 +0,0 @@ --// SPDX-License-Identifier: GPL-2.0 --/* -- * highmem.c: virtual kernel memory mappings for high memory -- * -- * Provides kernel-static versions of atomic kmap functions originally -- * found as inlines in include/asm-sparc/highmem.h. These became -- * needed as kmap_atomic() and kunmap_atomic() started getting -- * called from within modules. -- * -- Tomas Szepe <szepe@pinerecords.com>, September 2002 -- * -- * But kmap_atomic() and kunmap_atomic() cannot be inlined in -- * modules because they are loaded with btfixup-ped functions. -- */ -- --/* -- * The use of kmap_atomic/kunmap_atomic is discouraged - kmap/kunmap -- * gives a more generic (and caching) interface. But kmap_atomic can -- * be used in IRQ contexts, so in some (very limited) cases we need it. 
-- * -- * XXX This is an old text. Actually, it's good to use atomic kmaps, -- * provided you remember that they are atomic and not try to sleep -- * with a kmap taken, much like a spinlock. Non-atomic kmaps are -- * shared by CPUs, and so precious, and establishing them requires IPI. -- * Atomic kmaps are lightweight and we may have NCPUS more of them. -- */ --#include <linux/highmem.h> --#include <linux/export.h> --#include <linux/mm.h> -- --#include <asm/cacheflush.h> --#include <asm/tlbflush.h> --#include <asm/vaddrs.h> -- --static pte_t *kmap_pte; -- --void __init kmap_init(void) --{ -- unsigned long address = __fix_to_virt(FIX_KMAP_BEGIN); -- -- /* cache the first kmap pte */ -- kmap_pte = virt_to_kpte(address); --} -- --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) --{ -- unsigned long vaddr; -- long idx, type; -- -- type = kmap_atomic_idx_push(); -- idx = type + KM_TYPE_NR*smp_processor_id(); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -- --/* XXX Fix - Anton */ --#if 0 -- __flush_cache_one(vaddr); --#else -- flush_cache_all(); --#endif -- --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(!pte_none(*(kmap_pte-idx))); --#endif -- set_pte(kmap_pte-idx, mk_pte(page, prot)); --/* XXX Fix - Anton */ --#if 0 -- __flush_tlb_one(vaddr); --#else -- flush_tlb_all(); --#endif -- -- return (void*) vaddr; --} --EXPORT_SYMBOL(kmap_atomic_high_prot); -- --void kunmap_atomic_high(void *kvaddr) --{ -- unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK; -- int type; -- -- if (vaddr < FIXADDR_START) -- return; -- -- type = kmap_atomic_idx(); -- --#ifdef CONFIG_DEBUG_HIGHMEM -- { -- unsigned long idx; -- -- idx = type + KM_TYPE_NR * smp_processor_id(); -- BUG_ON(vaddr != __fix_to_virt(FIX_KMAP_BEGIN+idx)); -- -- /* XXX Fix - Anton */ --#if 0 -- __flush_cache_one(vaddr); --#else -- flush_cache_all(); --#endif -- -- /* -- * force other mappings to Oops if they'll try to access -- * this pte without first remap it -- */ -- pte_clear(&init_mm, vaddr, kmap_pte-idx); 
-- /* XXX Fix - Anton */ --#if 0 -- __flush_tlb_one(vaddr); --#else -- flush_tlb_all(); --#endif -- } --#endif -- -- kmap_atomic_idx_pop(); --} --EXPORT_SYMBOL(kunmap_atomic_high); -diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c -index 0070f8b9a753..a03caa5f6628 100644 ---- a/arch/sparc/mm/srmmu.c -+++ b/arch/sparc/mm/srmmu.c -@@ -971,8 +971,6 @@ void __init srmmu_paging_init(void) - - sparc_context_init(num_contexts); - -- kmap_init(); -- - { - unsigned long max_zone_pfn[MAX_NR_ZONES] = { 0 }; - --- -2.30.2 - diff --git a/debian/patches-rt/0040-xtensa-mm-highmem-Switch-to-generic-kmap-atomic.patch b/debian/patches-rt/0040-xtensa-mm-highmem-Switch-to-generic-kmap-atomic.patch deleted file mode 100644 index 9a0add4fa..000000000 --- a/debian/patches-rt/0040-xtensa-mm-highmem-Switch-to-generic-kmap-atomic.patch +++ /dev/null @@ -1,166 +0,0 @@ -From 2f356ff117fa939a9b101a9d15d8a1f6da4745e4 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:29 +0100 -Subject: [PATCH 040/296] xtensa/mm/highmem: Switch to generic kmap atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No reason having the same code in every architecture - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Chris Zankel <chris@zankel.net> -Cc: Max Filippov <jcmvbkbc@gmail.com> -Cc: linux-xtensa@linux-xtensa.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/xtensa/Kconfig | 1 + - arch/xtensa/include/asm/fixmap.h | 4 +-- - arch/xtensa/include/asm/highmem.h | 12 ++++++-- - arch/xtensa/mm/highmem.c | 46 ++++--------------------------- - 4 files changed, 18 insertions(+), 45 deletions(-) - -diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig -index d0dfa50bd0bb..dc22ef3cf4be 100644 ---- a/arch/xtensa/Kconfig -+++ b/arch/xtensa/Kconfig -@@ -666,6 +666,7 @@ endchoice - config HIGHMEM - bool "High Memory Support" - depends on MMU -+ select KMAP_LOCAL - help 
- Linux can use the full amount of RAM in the system by - default. However, the default MMUv2 setup only maps the -diff --git a/arch/xtensa/include/asm/fixmap.h b/arch/xtensa/include/asm/fixmap.h -index a06ffb0c61c7..92049b61c351 100644 ---- a/arch/xtensa/include/asm/fixmap.h -+++ b/arch/xtensa/include/asm/fixmap.h -@@ -16,7 +16,7 @@ - #ifdef CONFIG_HIGHMEM - #include <linux/threads.h> - #include <linux/pgtable.h> --#include <asm/kmap_types.h> -+#include <asm/kmap_size.h> - #endif - - /* -@@ -39,7 +39,7 @@ enum fixed_addresses { - /* reserved pte's for temporary kernel mappings */ - FIX_KMAP_BEGIN, - FIX_KMAP_END = FIX_KMAP_BEGIN + -- (KM_TYPE_NR * NR_CPUS * DCACHE_N_COLORS) - 1, -+ (KM_MAX_IDX * NR_CPUS * DCACHE_N_COLORS) - 1, - #endif - __end_of_fixed_addresses - }; -diff --git a/arch/xtensa/include/asm/highmem.h b/arch/xtensa/include/asm/highmem.h -index eac503215f17..0fc3b1cebc56 100644 ---- a/arch/xtensa/include/asm/highmem.h -+++ b/arch/xtensa/include/asm/highmem.h -@@ -16,9 +16,8 @@ - #include <linux/pgtable.h> - #include <asm/cacheflush.h> - #include <asm/fixmap.h> --#include <asm/kmap_types.h> - --#define PKMAP_BASE ((FIXADDR_START - \ -+#define PKMAP_BASE ((FIXADDR_START - \ - (LAST_PKMAP + 1) * PAGE_SIZE) & PMD_MASK) - #define LAST_PKMAP (PTRS_PER_PTE * DCACHE_N_COLORS) - #define LAST_PKMAP_MASK (LAST_PKMAP - 1) -@@ -68,6 +67,15 @@ static inline void flush_cache_kmaps(void) - flush_cache_all(); - } - -+enum fixed_addresses kmap_local_map_idx(int type, unsigned long pfn); -+#define arch_kmap_local_map_idx kmap_local_map_idx -+ -+enum fixed_addresses kmap_local_unmap_idx(int type, unsigned long addr); -+#define arch_kmap_local_unmap_idx kmap_local_unmap_idx -+ -+#define arch_kmap_local_post_unmap(vaddr) \ -+ local_flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE) -+ - void kmap_init(void); - - #endif -diff --git a/arch/xtensa/mm/highmem.c b/arch/xtensa/mm/highmem.c -index 673196fe862e..0735ca5e8f86 100644 ---- a/arch/xtensa/mm/highmem.c -+++ 
b/arch/xtensa/mm/highmem.c -@@ -12,8 +12,6 @@ - #include <linux/highmem.h> - #include <asm/tlbflush.h> - --static pte_t *kmap_pte; -- - #if DCACHE_WAY_SIZE > PAGE_SIZE - unsigned int last_pkmap_nr_arr[DCACHE_N_COLORS]; - wait_queue_head_t pkmap_map_wait_arr[DCACHE_N_COLORS]; -@@ -33,59 +31,25 @@ static inline void kmap_waitqueues_init(void) - - static inline enum fixed_addresses kmap_idx(int type, unsigned long color) - { -- return (type + KM_TYPE_NR * smp_processor_id()) * DCACHE_N_COLORS + -+ return (type + KM_MAX_IDX * smp_processor_id()) * DCACHE_N_COLORS + - color; - } - --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot) -+enum fixed_addresses kmap_local_map_idx(int type, unsigned long pfn) - { -- enum fixed_addresses idx; -- unsigned long vaddr; -- -- idx = kmap_idx(kmap_atomic_idx_push(), -- DCACHE_ALIAS(page_to_phys(page))); -- vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); --#ifdef CONFIG_DEBUG_HIGHMEM -- BUG_ON(!pte_none(*(kmap_pte + idx))); --#endif -- set_pte(kmap_pte + idx, mk_pte(page, prot)); -- -- return (void *)vaddr; -+ return kmap_idx(type, DCACHE_ALIAS(pfn << PAGE_SHIFT)); - } --EXPORT_SYMBOL(kmap_atomic_high_prot); - --void kunmap_atomic_high(void *kvaddr) -+enum fixed_addresses kmap_local_unmap_idx(int type, unsigned long addr) - { -- if (kvaddr >= (void *)FIXADDR_START && -- kvaddr < (void *)FIXADDR_TOP) { -- int idx = kmap_idx(kmap_atomic_idx(), -- DCACHE_ALIAS((unsigned long)kvaddr)); -- -- /* -- * Force other mappings to Oops if they'll try to access this -- * pte without first remap it. Keeping stale mappings around -- * is a bad idea also, in case the page changes cacheability -- * attributes or becomes a protected page in a hypervisor. 
-- */ -- pte_clear(&init_mm, kvaddr, kmap_pte + idx); -- local_flush_tlb_kernel_range((unsigned long)kvaddr, -- (unsigned long)kvaddr + PAGE_SIZE); -- -- kmap_atomic_idx_pop(); -- } -+ return kmap_idx(type, DCACHE_ALIAS(addr)); - } --EXPORT_SYMBOL(kunmap_atomic_high); - - void __init kmap_init(void) - { -- unsigned long kmap_vstart; -- - /* Check if this memory layout is broken because PKMAP overlaps - * page table. - */ - BUILD_BUG_ON(PKMAP_BASE < TLBTEMP_BASE_1 + TLBTEMP_SIZE); -- /* cache the first kmap pte */ -- kmap_vstart = __fix_to_virt(FIX_KMAP_BEGIN); -- kmap_pte = virt_to_kpte(kmap_vstart); - kmap_waitqueues_init(); - } --- -2.30.2 - diff --git a/debian/patches-rt/0041-highmem-Get-rid-of-kmap_types.h.patch b/debian/patches-rt/0041-highmem-Get-rid-of-kmap_types.h.patch deleted file mode 100644 index 2449a183c..000000000 --- a/debian/patches-rt/0041-highmem-Get-rid-of-kmap_types.h.patch +++ /dev/null @@ -1,189 +0,0 @@ -From 04e284e4d65e19d2450fed877188f86c02a235c0 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:30 +0100 -Subject: [PATCH 041/296] highmem: Get rid of kmap_types.h -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The header is not longer used and on alpha, ia64, openrisc, parisc and um -it was completely unused anyway as these architectures have no highmem -support. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/alpha/include/asm/kmap_types.h | 15 --------------- - arch/ia64/include/asm/kmap_types.h | 13 ------------- - arch/openrisc/mm/init.c | 1 - - arch/openrisc/mm/ioremap.c | 1 - - arch/parisc/include/asm/kmap_types.h | 13 ------------- - arch/um/include/asm/fixmap.h | 1 - - arch/um/include/asm/kmap_types.h | 13 ------------- - include/asm-generic/Kbuild | 1 - - include/asm-generic/kmap_types.h | 11 ----------- - include/linux/highmem.h | 2 -- - 10 files changed, 71 deletions(-) - delete mode 100644 arch/alpha/include/asm/kmap_types.h - delete mode 100644 arch/ia64/include/asm/kmap_types.h - delete mode 100644 arch/parisc/include/asm/kmap_types.h - delete mode 100644 arch/um/include/asm/kmap_types.h - delete mode 100644 include/asm-generic/kmap_types.h - -diff --git a/arch/alpha/include/asm/kmap_types.h b/arch/alpha/include/asm/kmap_types.h -deleted file mode 100644 -index 651714b45729..000000000000 ---- a/arch/alpha/include/asm/kmap_types.h -+++ /dev/null -@@ -1,15 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef _ASM_KMAP_TYPES_H --#define _ASM_KMAP_TYPES_H -- --/* Dummy header just to define km_type. 
*/ -- --#ifdef CONFIG_DEBUG_HIGHMEM --#define __WITH_KM_FENCE --#endif -- --#include <asm-generic/kmap_types.h> -- --#undef __WITH_KM_FENCE -- --#endif -diff --git a/arch/ia64/include/asm/kmap_types.h b/arch/ia64/include/asm/kmap_types.h -deleted file mode 100644 -index 5c268cf7c2bd..000000000000 ---- a/arch/ia64/include/asm/kmap_types.h -+++ /dev/null -@@ -1,13 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef _ASM_IA64_KMAP_TYPES_H --#define _ASM_IA64_KMAP_TYPES_H -- --#ifdef CONFIG_DEBUG_HIGHMEM --#define __WITH_KM_FENCE --#endif -- --#include <asm-generic/kmap_types.h> -- --#undef __WITH_KM_FENCE -- --#endif /* _ASM_IA64_KMAP_TYPES_H */ -diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c -index 8348feaaf46e..bf9b2310fc93 100644 ---- a/arch/openrisc/mm/init.c -+++ b/arch/openrisc/mm/init.c -@@ -33,7 +33,6 @@ - #include <asm/io.h> - #include <asm/tlb.h> - #include <asm/mmu_context.h> --#include <asm/kmap_types.h> - #include <asm/fixmap.h> - #include <asm/tlbflush.h> - #include <asm/sections.h> -diff --git a/arch/openrisc/mm/ioremap.c b/arch/openrisc/mm/ioremap.c -index a978590d802d..5aed97a18bac 100644 ---- a/arch/openrisc/mm/ioremap.c -+++ b/arch/openrisc/mm/ioremap.c -@@ -15,7 +15,6 @@ - #include <linux/io.h> - #include <linux/pgtable.h> - #include <asm/pgalloc.h> --#include <asm/kmap_types.h> - #include <asm/fixmap.h> - #include <asm/bug.h> - #include <linux/sched.h> -diff --git a/arch/parisc/include/asm/kmap_types.h b/arch/parisc/include/asm/kmap_types.h -deleted file mode 100644 -index 3e70b5cd1123..000000000000 ---- a/arch/parisc/include/asm/kmap_types.h -+++ /dev/null -@@ -1,13 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef _ASM_KMAP_TYPES_H --#define _ASM_KMAP_TYPES_H -- --#ifdef CONFIG_DEBUG_HIGHMEM --#define __WITH_KM_FENCE --#endif -- --#include <asm-generic/kmap_types.h> -- --#undef __WITH_KM_FENCE -- --#endif -diff --git a/arch/um/include/asm/fixmap.h b/arch/um/include/asm/fixmap.h -index 2c697a145ac1..2efac5827188 
100644 ---- a/arch/um/include/asm/fixmap.h -+++ b/arch/um/include/asm/fixmap.h -@@ -3,7 +3,6 @@ - #define __UM_FIXMAP_H - - #include <asm/processor.h> --#include <asm/kmap_types.h> - #include <asm/archparam.h> - #include <asm/page.h> - #include <linux/threads.h> -diff --git a/arch/um/include/asm/kmap_types.h b/arch/um/include/asm/kmap_types.h -deleted file mode 100644 -index b0bd12de1d23..000000000000 ---- a/arch/um/include/asm/kmap_types.h -+++ /dev/null -@@ -1,13 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --/* -- * Copyright (C) 2002 Jeff Dike (jdike@karaya.com) -- */ -- --#ifndef __UM_KMAP_TYPES_H --#define __UM_KMAP_TYPES_H -- --/* No more #include "asm/arch/kmap_types.h" ! */ -- --#define KM_TYPE_NR 14 -- --#endif -diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild -index 3114a6da7e56..267f6dfb8960 100644 ---- a/include/asm-generic/Kbuild -+++ b/include/asm-generic/Kbuild -@@ -30,7 +30,6 @@ mandatory-y += irq.h - mandatory-y += irq_regs.h - mandatory-y += irq_work.h - mandatory-y += kdebug.h --mandatory-y += kmap_types.h - mandatory-y += kmap_size.h - mandatory-y += kprobes.h - mandatory-y += linkage.h -diff --git a/include/asm-generic/kmap_types.h b/include/asm-generic/kmap_types.h -deleted file mode 100644 -index 9f95b7b63d19..000000000000 ---- a/include/asm-generic/kmap_types.h -+++ /dev/null -@@ -1,11 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef _ASM_GENERIC_KMAP_TYPES_H --#define _ASM_GENERIC_KMAP_TYPES_H -- --#ifdef __WITH_KM_FENCE --# define KM_TYPE_NR 41 --#else --# define KM_TYPE_NR 20 --#endif -- --#endif -diff --git a/include/linux/highmem.h b/include/linux/highmem.h -index 1222a31be842..de78869454b1 100644 ---- a/include/linux/highmem.h -+++ b/include/linux/highmem.h -@@ -29,8 +29,6 @@ static inline void invalidate_kernel_vmap_range(void *vaddr, int size) - } - #endif - --#include <asm/kmap_types.h> -- - /* - * Outside of CONFIG_HIGHMEM to support X86 32bit iomap_atomic() cruft. 
- */ --- -2.30.2 - diff --git a/debian/patches-rt/0042-mm-highmem-Remove-the-old-kmap_atomic-cruft.patch b/debian/patches-rt/0042-mm-highmem-Remove-the-old-kmap_atomic-cruft.patch deleted file mode 100644 index bc1e4c22a..000000000 --- a/debian/patches-rt/0042-mm-highmem-Remove-the-old-kmap_atomic-cruft.patch +++ /dev/null @@ -1,139 +0,0 @@ -From 9b38edca6f12619cbe82984ba295ed568aa1d1f3 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:31 +0100 -Subject: [PATCH 042/296] mm/highmem: Remove the old kmap_atomic cruft -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -All users gone. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem.h | 63 +++-------------------------------------- - mm/highmem.c | 7 +---- - 2 files changed, 5 insertions(+), 65 deletions(-) - -diff --git a/include/linux/highmem.h b/include/linux/highmem.h -index de78869454b1..3180a8f7307f 100644 ---- a/include/linux/highmem.h -+++ b/include/linux/highmem.h -@@ -86,31 +86,16 @@ static inline void kunmap(struct page *page) - * be used in IRQ contexts, so in some (very limited) cases we need - * it. 
- */ -- --#ifndef CONFIG_KMAP_LOCAL --void *kmap_atomic_high_prot(struct page *page, pgprot_t prot); --void kunmap_atomic_high(void *kvaddr); -- - static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) - { - preempt_disable(); - pagefault_disable(); -- if (!PageHighMem(page)) -- return page_address(page); -- return kmap_atomic_high_prot(page, prot); --} -- --static inline void __kunmap_atomic(void *vaddr) --{ -- kunmap_atomic_high(vaddr); -+ return __kmap_local_page_prot(page, prot); - } --#else /* !CONFIG_KMAP_LOCAL */ - --static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) -+static inline void *kmap_atomic(struct page *page) - { -- preempt_disable(); -- pagefault_disable(); -- return __kmap_local_page_prot(page, prot); -+ return kmap_atomic_prot(page, kmap_prot); - } - - static inline void *kmap_atomic_pfn(unsigned long pfn) -@@ -125,13 +110,6 @@ static inline void __kunmap_atomic(void *addr) - kunmap_local_indexed(addr); - } - --#endif /* CONFIG_KMAP_LOCAL */ -- --static inline void *kmap_atomic(struct page *page) --{ -- return kmap_atomic_prot(page, kmap_prot); --} -- - /* declarations for linux/mm/highmem.c */ - unsigned int nr_free_highpages(void); - extern atomic_long_t _totalhigh_pages; -@@ -212,41 +190,8 @@ static inline void __kunmap_atomic(void *addr) - - #define kmap_flush_unused() do {} while(0) - --#endif /* CONFIG_HIGHMEM */ -- --#if !defined(CONFIG_KMAP_LOCAL) --#if defined(CONFIG_HIGHMEM) -- --DECLARE_PER_CPU(int, __kmap_atomic_idx); -- --static inline int kmap_atomic_idx_push(void) --{ -- int idx = __this_cpu_inc_return(__kmap_atomic_idx) - 1; -- --#ifdef CONFIG_DEBUG_HIGHMEM -- WARN_ON_ONCE(in_irq() && !irqs_disabled()); -- BUG_ON(idx >= KM_TYPE_NR); --#endif -- return idx; --} -- --static inline int kmap_atomic_idx(void) --{ -- return __this_cpu_read(__kmap_atomic_idx) - 1; --} - --static inline void kmap_atomic_idx_pop(void) --{ --#ifdef CONFIG_DEBUG_HIGHMEM -- int idx = 
__this_cpu_dec_return(__kmap_atomic_idx); -- -- BUG_ON(idx < 0); --#else -- __this_cpu_dec(__kmap_atomic_idx); --#endif --} --#endif --#endif -+#endif /* CONFIG_HIGHMEM */ - - /* - * Prevent people trying to call kunmap_atomic() as if it were kunmap() -diff --git a/mm/highmem.c b/mm/highmem.c -index 77677c6844f7..499dfafd36b7 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -31,12 +31,6 @@ - #include <asm/tlbflush.h> - #include <linux/vmalloc.h> - --#ifndef CONFIG_KMAP_LOCAL --#ifdef CONFIG_HIGHMEM --DEFINE_PER_CPU(int, __kmap_atomic_idx); --#endif --#endif -- - /* - * Virtual_count is not a pure "count". - * 0 means that it is not mapped, and has not been mapped -@@ -410,6 +404,7 @@ static inline void kmap_local_idx_pop(void) - #ifndef arch_kmap_local_post_map - # define arch_kmap_local_post_map(vaddr, pteval) do { } while (0) - #endif -+ - #ifndef arch_kmap_local_pre_unmap - # define arch_kmap_local_pre_unmap(vaddr) do { } while (0) - #endif --- -2.30.2 - diff --git a/debian/patches-rt/0043-io-mapping-Cleanup-atomic-iomap.patch b/debian/patches-rt/0043-io-mapping-Cleanup-atomic-iomap.patch deleted file mode 100644 index a40715129..000000000 --- a/debian/patches-rt/0043-io-mapping-Cleanup-atomic-iomap.patch +++ /dev/null @@ -1,90 +0,0 @@ -From f0c3c6d5598c739d9a5bcc44bd9f28c3d4327cbf Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:32 +0100 -Subject: [PATCH 043/296] io-mapping: Cleanup atomic iomap -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Switch the atomic iomap implementation over to kmap_local and stick the -preempt/pagefault mechanics into the generic code similar to the -kmap_atomic variants. - -Rename the x86 map function in preparation for a non-atomic variant. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/x86/include/asm/iomap.h | 9 +-------- - arch/x86/mm/iomap_32.c | 6 ++---- - include/linux/io-mapping.h | 8 ++++++-- - 3 files changed, 9 insertions(+), 14 deletions(-) - -diff --git a/arch/x86/include/asm/iomap.h b/arch/x86/include/asm/iomap.h -index 0be7a30fd6bc..e2de092fc38c 100644 ---- a/arch/x86/include/asm/iomap.h -+++ b/arch/x86/include/asm/iomap.h -@@ -13,14 +13,7 @@ - #include <asm/cacheflush.h> - #include <asm/tlbflush.h> - --void __iomem *iomap_atomic_pfn_prot(unsigned long pfn, pgprot_t prot); -- --static inline void iounmap_atomic(void __iomem *vaddr) --{ -- kunmap_local_indexed((void __force *)vaddr); -- pagefault_enable(); -- preempt_enable(); --} -+void __iomem *__iomap_local_pfn_prot(unsigned long pfn, pgprot_t prot); - - int iomap_create_wc(resource_size_t base, unsigned long size, pgprot_t *prot); - -diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c -index e0a40d7cc66c..9aaa756ddf21 100644 ---- a/arch/x86/mm/iomap_32.c -+++ b/arch/x86/mm/iomap_32.c -@@ -44,7 +44,7 @@ void iomap_free(resource_size_t base, unsigned long size) - } - EXPORT_SYMBOL_GPL(iomap_free); - --void __iomem *iomap_atomic_pfn_prot(unsigned long pfn, pgprot_t prot) -+void __iomem *__iomap_local_pfn_prot(unsigned long pfn, pgprot_t prot) - { - /* - * For non-PAT systems, translate non-WB request to UC- just in -@@ -60,8 +60,6 @@ void __iomem *iomap_atomic_pfn_prot(unsigned long pfn, pgprot_t prot) - /* Filter out unsupported __PAGE_KERNEL* bits: */ - pgprot_val(prot) &= __default_kernel_pte_mask; - -- preempt_disable(); -- pagefault_disable(); - return (void __force __iomem *)__kmap_local_pfn_prot(pfn, prot); - } --EXPORT_SYMBOL_GPL(iomap_atomic_pfn_prot); -+EXPORT_SYMBOL_GPL(__iomap_local_pfn_prot); -diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h -index 3b0940be72e9..60e7c83e4904 100644 ---- 
a/include/linux/io-mapping.h -+++ b/include/linux/io-mapping.h -@@ -69,13 +69,17 @@ io_mapping_map_atomic_wc(struct io_mapping *mapping, - - BUG_ON(offset >= mapping->size); - phys_addr = mapping->base + offset; -- return iomap_atomic_pfn_prot(PHYS_PFN(phys_addr), mapping->prot); -+ preempt_disable(); -+ pagefault_disable(); -+ return __iomap_local_pfn_prot(PHYS_PFN(phys_addr), mapping->prot); - } - - static inline void - io_mapping_unmap_atomic(void __iomem *vaddr) - { -- iounmap_atomic(vaddr); -+ kunmap_local_indexed((void __force *)vaddr); -+ pagefault_enable(); -+ preempt_enable(); - } - - static inline void __iomem * --- -2.30.2 - diff --git a/debian/patches-rt/0044-Documentation-io-mapping-Remove-outdated-blurb.patch b/debian/patches-rt/0044-Documentation-io-mapping-Remove-outdated-blurb.patch deleted file mode 100644 index 3878e1102..000000000 --- a/debian/patches-rt/0044-Documentation-io-mapping-Remove-outdated-blurb.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 52bed77b466037fc3ffa7b231f1767edeb01ebeb Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:33 +0100 -Subject: [PATCH 044/296] Documentation/io-mapping: Remove outdated blurb -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The implementation details in the documentation are outdated and not really -helpful. Remove them. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - Documentation/driver-api/io-mapping.rst | 22 ---------------------- - 1 file changed, 22 deletions(-) - -diff --git a/Documentation/driver-api/io-mapping.rst b/Documentation/driver-api/io-mapping.rst -index a966239f04e4..e33b88268554 100644 ---- a/Documentation/driver-api/io-mapping.rst -+++ b/Documentation/driver-api/io-mapping.rst -@@ -73,25 +73,3 @@ for pages mapped with io_mapping_map_wc. 
- At driver close time, the io_mapping object must be freed:: - - void io_mapping_free(struct io_mapping *mapping) -- --Current Implementation --====================== -- --The initial implementation of these functions uses existing mapping --mechanisms and so provides only an abstraction layer and no new --functionality. -- --On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole --range, creating a permanent kernel-visible mapping to the resource. The --map_atomic and map functions add the requested offset to the base of the --virtual address returned by ioremap_wc. -- --On 32-bit processors with HIGHMEM defined, io_mapping_map_atomic_wc uses --kmap_atomic_pfn to map the specified page in an atomic fashion; --kmap_atomic_pfn isn't really supposed to be used with device pages, but it --provides an efficient mapping for this usage. -- --On 32-bit processors without HIGHMEM defined, io_mapping_map_atomic_wc and --io_mapping_map_wc both use ioremap_wc, a terribly inefficient function which --performs an IPI to inform all processors about the new mapping. This results --in a significant performance penalty. --- -2.30.2 - diff --git a/debian/patches-rt/0045-highmem-High-implementation-details-and-document-API.patch b/debian/patches-rt/0045-highmem-High-implementation-details-and-document-API.patch deleted file mode 100644 index af548bf38..000000000 --- a/debian/patches-rt/0045-highmem-High-implementation-details-and-document-API.patch +++ /dev/null @@ -1,544 +0,0 @@ -From 9f1b994bcdd611f1614aad94dba4528bf24b02d4 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:34 +0100 -Subject: [PATCH 045/296] highmem: High implementation details and document API -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Move the gory details of kmap & al into a private header and only document -the interfaces which are usable by drivers. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem-internal.h | 174 ++++++++++++++++++++ - include/linux/highmem.h | 266 +++++++++++-------------------- - mm/highmem.c | 11 +- - 3 files changed, 274 insertions(+), 177 deletions(-) - create mode 100644 include/linux/highmem-internal.h - -diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h -new file mode 100644 -index 000000000000..6ceed907b14e ---- /dev/null -+++ b/include/linux/highmem-internal.h -@@ -0,0 +1,174 @@ -+/* SPDX-License-Identifier: GPL-2.0 */ -+#ifndef _LINUX_HIGHMEM_INTERNAL_H -+#define _LINUX_HIGHMEM_INTERNAL_H -+ -+/* -+ * Outside of CONFIG_HIGHMEM to support X86 32bit iomap_atomic() cruft. -+ */ -+#ifdef CONFIG_KMAP_LOCAL -+void *__kmap_local_pfn_prot(unsigned long pfn, pgprot_t prot); -+void *__kmap_local_page_prot(struct page *page, pgprot_t prot); -+void kunmap_local_indexed(void *vaddr); -+#endif -+ -+#ifdef CONFIG_HIGHMEM -+#include <asm/highmem.h> -+ -+#ifndef ARCH_HAS_KMAP_FLUSH_TLB -+static inline void kmap_flush_tlb(unsigned long addr) { } -+#endif -+ -+#ifndef kmap_prot -+#define kmap_prot PAGE_KERNEL -+#endif -+ -+void *kmap_high(struct page *page); -+void kunmap_high(struct page *page); -+void __kmap_flush_unused(void); -+struct page *__kmap_to_page(void *addr); -+ -+static inline void *kmap(struct page *page) -+{ -+ void *addr; -+ -+ might_sleep(); -+ if (!PageHighMem(page)) -+ addr = page_address(page); -+ else -+ addr = kmap_high(page); -+ kmap_flush_tlb((unsigned long)addr); -+ return addr; -+} -+ -+static inline void kunmap(struct page *page) -+{ -+ might_sleep(); -+ if (!PageHighMem(page)) -+ return; -+ kunmap_high(page); -+} -+ -+static inline struct page *kmap_to_page(void *addr) -+{ -+ return __kmap_to_page(addr); -+} -+ -+static inline void kmap_flush_unused(void) -+{ -+ __kmap_flush_unused(); -+} -+ -+static inline void *kmap_atomic_prot(struct page 
*page, pgprot_t prot) -+{ -+ preempt_disable(); -+ pagefault_disable(); -+ return __kmap_local_page_prot(page, prot); -+} -+ -+static inline void *kmap_atomic(struct page *page) -+{ -+ return kmap_atomic_prot(page, kmap_prot); -+} -+ -+static inline void *kmap_atomic_pfn(unsigned long pfn) -+{ -+ preempt_disable(); -+ pagefault_disable(); -+ return __kmap_local_pfn_prot(pfn, kmap_prot); -+} -+ -+static inline void __kunmap_atomic(void *addr) -+{ -+ kunmap_local_indexed(addr); -+ pagefault_enable(); -+ preempt_enable(); -+} -+ -+unsigned int __nr_free_highpages(void); -+extern atomic_long_t _totalhigh_pages; -+ -+static inline unsigned int nr_free_highpages(void) -+{ -+ return __nr_free_highpages(); -+} -+ -+static inline unsigned long totalhigh_pages(void) -+{ -+ return (unsigned long)atomic_long_read(&_totalhigh_pages); -+} -+ -+static inline void totalhigh_pages_inc(void) -+{ -+ atomic_long_inc(&_totalhigh_pages); -+} -+ -+static inline void totalhigh_pages_add(long count) -+{ -+ atomic_long_add(count, &_totalhigh_pages); -+} -+ -+#else /* CONFIG_HIGHMEM */ -+ -+static inline struct page *kmap_to_page(void *addr) -+{ -+ return virt_to_page(addr); -+} -+ -+static inline void *kmap(struct page *page) -+{ -+ might_sleep(); -+ return page_address(page); -+} -+ -+static inline void kunmap_high(struct page *page) { } -+static inline void kmap_flush_unused(void) { } -+ -+static inline void kunmap(struct page *page) -+{ -+#ifdef ARCH_HAS_FLUSH_ON_KUNMAP -+ kunmap_flush_on_unmap(page_address(page)); -+#endif -+} -+ -+static inline void *kmap_atomic(struct page *page) -+{ -+ preempt_disable(); -+ pagefault_disable(); -+ return page_address(page); -+} -+ -+static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) -+{ -+ return kmap_atomic(page); -+} -+ -+static inline void *kmap_atomic_pfn(unsigned long pfn) -+{ -+ return kmap_atomic(pfn_to_page(pfn)); -+} -+ -+static inline void __kunmap_atomic(void *addr) -+{ -+#ifdef ARCH_HAS_FLUSH_ON_KUNMAP -+ 
kunmap_flush_on_unmap(addr); -+#endif -+ pagefault_enable(); -+ preempt_enable(); -+} -+ -+static inline unsigned int nr_free_highpages(void) { return 0; } -+static inline unsigned long totalhigh_pages(void) { return 0UL; } -+ -+#endif /* CONFIG_HIGHMEM */ -+ -+/* -+ * Prevent people trying to call kunmap_atomic() as if it were kunmap() -+ * kunmap_atomic() should get the return value of kmap_atomic, not the page. -+ */ -+#define kunmap_atomic(__addr) \ -+do { \ -+ BUILD_BUG_ON(__same_type((__addr), struct page *)); \ -+ __kunmap_atomic(__addr); \ -+} while (0) -+ -+#endif -diff --git a/include/linux/highmem.h b/include/linux/highmem.h -index 3180a8f7307f..7d098bd621f6 100644 ---- a/include/linux/highmem.h -+++ b/include/linux/highmem.h -@@ -11,199 +11,125 @@ - - #include <asm/cacheflush.h> - --#ifndef ARCH_HAS_FLUSH_ANON_PAGE --static inline void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr) --{ --} --#endif -- --#ifndef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE --static inline void flush_kernel_dcache_page(struct page *page) --{ --} --static inline void flush_kernel_vmap_range(void *vaddr, int size) --{ --} --static inline void invalidate_kernel_vmap_range(void *vaddr, int size) --{ --} --#endif -+#include "highmem-internal.h" - --/* -- * Outside of CONFIG_HIGHMEM to support X86 32bit iomap_atomic() cruft. -+/** -+ * kmap - Map a page for long term usage -+ * @page: Pointer to the page to be mapped -+ * -+ * Returns: The virtual address of the mapping -+ * -+ * Can only be invoked from preemptible task context because on 32bit -+ * systems with CONFIG_HIGHMEM enabled this function might sleep. -+ * -+ * For systems with CONFIG_HIGHMEM=n and for pages in the low memory area -+ * this returns the virtual address of the direct kernel mapping. -+ * -+ * The returned virtual address is globally visible and valid up to the -+ * point where it is unmapped via kunmap(). The pointer can be handed to -+ * other contexts. 
-+ * -+ * For highmem pages on 32bit systems this can be slow as the mapping space -+ * is limited and protected by a global lock. In case that there is no -+ * mapping slot available the function blocks until a slot is released via -+ * kunmap(). - */ --#ifdef CONFIG_KMAP_LOCAL --void *__kmap_local_pfn_prot(unsigned long pfn, pgprot_t prot); --void *__kmap_local_page_prot(struct page *page, pgprot_t prot); --void kunmap_local_indexed(void *vaddr); --#endif -- --#ifdef CONFIG_HIGHMEM --#include <asm/highmem.h> -+static inline void *kmap(struct page *page); - --#ifndef ARCH_HAS_KMAP_FLUSH_TLB --static inline void kmap_flush_tlb(unsigned long addr) { } --#endif -- --#ifndef kmap_prot --#define kmap_prot PAGE_KERNEL --#endif -- --void *kmap_high(struct page *page); --static inline void *kmap(struct page *page) --{ -- void *addr; -- -- might_sleep(); -- if (!PageHighMem(page)) -- addr = page_address(page); -- else -- addr = kmap_high(page); -- kmap_flush_tlb((unsigned long)addr); -- return addr; --} -+/** -+ * kunmap - Unmap the virtual address mapped by kmap() -+ * @addr: Virtual address to be unmapped -+ * -+ * Counterpart to kmap(). A NOOP for CONFIG_HIGHMEM=n and for mappings of -+ * pages in the low memory area. -+ */ -+static inline void kunmap(struct page *page); - --void kunmap_high(struct page *page); -+/** -+ * kmap_to_page - Get the page for a kmap'ed address -+ * @addr: The address to look up -+ * -+ * Returns: The page which is mapped to @addr. 
-+ */ -+static inline struct page *kmap_to_page(void *addr); - --static inline void kunmap(struct page *page) --{ -- might_sleep(); -- if (!PageHighMem(page)) -- return; -- kunmap_high(page); --} -+/** -+ * kmap_flush_unused - Flush all unused kmap mappings in order to -+ * remove stray mappings -+ */ -+static inline void kmap_flush_unused(void); - --/* -- * kmap_atomic/kunmap_atomic is significantly faster than kmap/kunmap because -- * no global lock is needed and because the kmap code must perform a global TLB -- * invalidation when the kmap pool wraps. -+/** -+ * kmap_atomic - Atomically map a page for temporary usage -+ * @page: Pointer to the page to be mapped -+ * -+ * Returns: The virtual address of the mapping -+ * -+ * Side effect: On return pagefaults and preemption are disabled. -+ * -+ * Can be invoked from any context. - * -- * However when holding an atomic kmap it is not legal to sleep, so atomic -- * kmaps are appropriate for short, tight code paths only. -+ * Requires careful handling when nesting multiple mappings because the map -+ * management is stack based. The unmap has to be in the reverse order of -+ * the map operation: - * -- * The use of kmap_atomic/kunmap_atomic is discouraged - kmap/kunmap -- * gives a more generic (and caching) interface. But kmap_atomic can -- * be used in IRQ contexts, so in some (very limited) cases we need -- * it. -+ * addr1 = kmap_atomic(page1); -+ * addr2 = kmap_atomic(page2); -+ * ... -+ * kunmap_atomic(addr2); -+ * kunmap_atomic(addr1); -+ * -+ * Unmapping addr1 before addr2 is invalid and causes malfunction. -+ * -+ * Contrary to kmap() mappings the mapping is only valid in the context of -+ * the caller and cannot be handed to other contexts. -+ * -+ * On CONFIG_HIGHMEM=n kernels and for low memory pages this returns the -+ * virtual address of the direct mapping. Only real highmem pages are -+ * temporarily mapped. 
-+ * -+ * While it is significantly faster than kmap() it comes with restrictions -+ * about the pointer validity and the side effects of disabling page faults -+ * and preemption. Use it only when absolutely necessary, e.g. from non -+ * preemptible contexts. - */ --static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) --{ -- preempt_disable(); -- pagefault_disable(); -- return __kmap_local_page_prot(page, prot); --} -+static inline void *kmap_atomic(struct page *page); - --static inline void *kmap_atomic(struct page *page) --{ -- return kmap_atomic_prot(page, kmap_prot); --} -- --static inline void *kmap_atomic_pfn(unsigned long pfn) --{ -- preempt_disable(); -- pagefault_disable(); -- return __kmap_local_pfn_prot(pfn, kmap_prot); --} -- --static inline void __kunmap_atomic(void *addr) --{ -- kunmap_local_indexed(addr); --} -- --/* declarations for linux/mm/highmem.c */ --unsigned int nr_free_highpages(void); --extern atomic_long_t _totalhigh_pages; --static inline unsigned long totalhigh_pages(void) --{ -- return (unsigned long)atomic_long_read(&_totalhigh_pages); --} -- --static inline void totalhigh_pages_inc(void) --{ -- atomic_long_inc(&_totalhigh_pages); --} -- --static inline void totalhigh_pages_add(long count) --{ -- atomic_long_add(count, &_totalhigh_pages); --} -- --void kmap_flush_unused(void); -- --struct page *kmap_to_page(void *addr); -- --#else /* CONFIG_HIGHMEM */ -- --static inline unsigned int nr_free_highpages(void) { return 0; } -- --static inline struct page *kmap_to_page(void *addr) --{ -- return virt_to_page(addr); --} -- --static inline unsigned long totalhigh_pages(void) { return 0UL; } -+/** -+ * kunmap_atomic - Unmap the virtual address mapped by kmap_atomic() -+ * @addr: Virtual address to be unmapped -+ * -+ * Counterpart to kmap_atomic(). -+ * -+ * Undoes the side effects of kmap_atomic(), i.e. reenabling pagefaults and -+ * preemption. 
-+ * -+ * Other than that a NOOP for CONFIG_HIGHMEM=n and for mappings of pages -+ * in the low memory area. For real highmen pages the mapping which was -+ * established with kmap_atomic() is destroyed. -+ */ - --static inline void *kmap(struct page *page) --{ -- might_sleep(); -- return page_address(page); --} -+/* Highmem related interfaces for management code */ -+static inline unsigned int nr_free_highpages(void); -+static inline unsigned long totalhigh_pages(void); - --static inline void kunmap_high(struct page *page) -+#ifndef ARCH_HAS_FLUSH_ANON_PAGE -+static inline void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr) - { - } -- --static inline void kunmap(struct page *page) --{ --#ifdef ARCH_HAS_FLUSH_ON_KUNMAP -- kunmap_flush_on_unmap(page_address(page)); - #endif --} - --static inline void *kmap_atomic(struct page *page) -+#ifndef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE -+static inline void flush_kernel_dcache_page(struct page *page) - { -- preempt_disable(); -- pagefault_disable(); -- return page_address(page); - } -- --static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) -+static inline void flush_kernel_vmap_range(void *vaddr, int size) - { -- return kmap_atomic(page); - } -- --static inline void *kmap_atomic_pfn(unsigned long pfn) -+static inline void invalidate_kernel_vmap_range(void *vaddr, int size) - { -- return kmap_atomic(pfn_to_page(pfn)); - } -- --static inline void __kunmap_atomic(void *addr) --{ -- /* -- * Mostly nothing to do in the CONFIG_HIGHMEM=n case as kunmap_atomic() -- * handles re-enabling faults and preemption -- */ --#ifdef ARCH_HAS_FLUSH_ON_KUNMAP -- kunmap_flush_on_unmap(addr); - #endif --} -- --#define kmap_flush_unused() do {} while(0) -- -- --#endif /* CONFIG_HIGHMEM */ -- --/* -- * Prevent people trying to call kunmap_atomic() as if it were kunmap() -- * kunmap_atomic() should get the return value of kmap_atomic, not the page. 
-- */ --#define kunmap_atomic(__addr) \ --do { \ -- BUILD_BUG_ON(__same_type((__addr), struct page *)); \ -- __kunmap_atomic(__addr); \ -- pagefault_enable(); \ -- preempt_enable(); \ --} while (0) - - /* when CONFIG_HIGHMEM is not set these will be plain clear/copy_page */ - #ifndef clear_user_highpage -diff --git a/mm/highmem.c b/mm/highmem.c -index 499dfafd36b7..54bd233846c9 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -104,7 +104,7 @@ static inline wait_queue_head_t *get_pkmap_wait_queue_head(unsigned int color) - atomic_long_t _totalhigh_pages __read_mostly; - EXPORT_SYMBOL(_totalhigh_pages); - --unsigned int nr_free_highpages (void) -+unsigned int __nr_free_highpages (void) - { - struct zone *zone; - unsigned int pages = 0; -@@ -141,7 +141,7 @@ pte_t * pkmap_page_table; - do { spin_unlock(&kmap_lock); (void)(flags); } while (0) - #endif - --struct page *kmap_to_page(void *vaddr) -+struct page *__kmap_to_page(void *vaddr) - { - unsigned long addr = (unsigned long)vaddr; - -@@ -152,7 +152,7 @@ struct page *kmap_to_page(void *vaddr) - - return virt_to_page(addr); - } --EXPORT_SYMBOL(kmap_to_page); -+EXPORT_SYMBOL(__kmap_to_page); - - static void flush_all_zero_pkmaps(void) - { -@@ -194,10 +194,7 @@ static void flush_all_zero_pkmaps(void) - flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP)); - } - --/** -- * kmap_flush_unused - flush all unused kmap mappings in order to remove stray mappings -- */ --void kmap_flush_unused(void) -+void __kmap_flush_unused(void) - { - lock_kmap(); - flush_all_zero_pkmaps(); --- -2.30.2 - diff --git a/debian/patches-rt/0046-sched-Make-migrate_disable-enable-independent-of-RT.patch b/debian/patches-rt/0046-sched-Make-migrate_disable-enable-independent-of-RT.patch deleted file mode 100644 index 51b9422df..000000000 --- a/debian/patches-rt/0046-sched-Make-migrate_disable-enable-independent-of-RT.patch +++ /dev/null @@ -1,293 +0,0 @@ -From 225cf76c227b6ff72c7568163a70338a1a4844bc Mon Sep 17 00:00:00 2001 -From: Thomas 
Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:35 +0100 -Subject: [PATCH 046/296] sched: Make migrate_disable/enable() independent of - RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Now that the scheduler can deal with migrate disable properly, there is no -real compelling reason to make it only available for RT. - -There are quite some code pathes which needlessly disable preemption in -order to prevent migration and some constructs like kmap_atomic() enforce -it implicitly. - -Making it available independent of RT allows to provide a preemptible -variant of kmap_atomic() and makes the code more consistent in general. - -FIXME: Rework the comment in preempt.h - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Peter Zijlstra <peterz@infradead.org> -Cc: Ingo Molnar <mingo@kernel.org> -Cc: Juri Lelli <juri.lelli@redhat.com> -Cc: Vincent Guittot <vincent.guittot@linaro.org> -Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> -Cc: Steven Rostedt <rostedt@goodmis.org> -Cc: Ben Segall <bsegall@google.com> -Cc: Mel Gorman <mgorman@suse.de> -Cc: Daniel Bristot de Oliveira <bristot@redhat.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/kernel.h | 21 ++++++++++++------- - include/linux/preempt.h | 38 +++------------------------------- - include/linux/sched.h | 2 +- - kernel/sched/core.c | 45 ++++++++++++++++++++++++++++++++--------- - kernel/sched/sched.h | 4 ++-- - lib/smp_processor_id.c | 2 +- - 6 files changed, 56 insertions(+), 56 deletions(-) - -diff --git a/include/linux/kernel.h b/include/linux/kernel.h -index 2f05e9128201..665837f9a831 100644 ---- a/include/linux/kernel.h -+++ b/include/linux/kernel.h -@@ -204,6 +204,7 @@ extern int _cond_resched(void); - extern void ___might_sleep(const char *file, int line, int preempt_offset); - extern void __might_sleep(const char *file, int line, int preempt_offset); - extern void __cant_sleep(const char 
*file, int line, int preempt_offset); -+extern void __cant_migrate(const char *file, int line); - - /** - * might_sleep - annotation for functions that can sleep -@@ -227,6 +228,18 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); - # define cant_sleep() \ - do { __cant_sleep(__FILE__, __LINE__, 0); } while (0) - # define sched_annotate_sleep() (current->task_state_change = 0) -+ -+/** -+ * cant_migrate - annotation for functions that cannot migrate -+ * -+ * Will print a stack trace if executed in code which is migratable -+ */ -+# define cant_migrate() \ -+ do { \ -+ if (IS_ENABLED(CONFIG_SMP)) \ -+ __cant_migrate(__FILE__, __LINE__); \ -+ } while (0) -+ - /** - * non_block_start - annotate the start of section where sleeping is prohibited - * -@@ -251,6 +264,7 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); - int preempt_offset) { } - # define might_sleep() do { might_resched(); } while (0) - # define cant_sleep() do { } while (0) -+# define cant_migrate() do { } while (0) - # define sched_annotate_sleep() do { } while (0) - # define non_block_start() do { } while (0) - # define non_block_end() do { } while (0) -@@ -258,13 +272,6 @@ extern void __cant_sleep(const char *file, int line, int preempt_offset); - - #define might_sleep_if(cond) do { if (cond) might_sleep(); } while (0) - --#ifndef CONFIG_PREEMPT_RT --# define cant_migrate() cant_sleep() --#else -- /* Placeholder for now */ --# define cant_migrate() do { } while (0) --#endif -- - /** - * abs - return absolute value of an argument - * @x: the value. If it is unsigned type, it is converted to signed type first. 
-diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index 8b43922e65df..6df63cbe8bb0 100644 ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -322,7 +322,7 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier, - - #endif - --#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+#ifdef CONFIG_SMP - - /* - * Migrate-Disable and why it is undesired. -@@ -382,43 +382,11 @@ static inline void preempt_notifier_init(struct preempt_notifier *notifier, - extern void migrate_disable(void); - extern void migrate_enable(void); - --#elif defined(CONFIG_PREEMPT_RT) -+#else - - static inline void migrate_disable(void) { } - static inline void migrate_enable(void) { } - --#else /* !CONFIG_PREEMPT_RT */ -- --/** -- * migrate_disable - Prevent migration of the current task -- * -- * Maps to preempt_disable() which also disables preemption. Use -- * migrate_disable() to annotate that the intent is to prevent migration, -- * but not necessarily preemption. -- * -- * Can be invoked nested like preempt_disable() and needs the corresponding -- * number of migrate_enable() invocations. -- */ --static __always_inline void migrate_disable(void) --{ -- preempt_disable(); --} -- --/** -- * migrate_enable - Allow migration of the current task -- * -- * Counterpart to migrate_disable(). -- * -- * As migrate_disable() can be invoked nested, only the outermost invocation -- * reenables migration. -- * -- * Currently mapped to preempt_enable(). 
-- */ --static __always_inline void migrate_enable(void) --{ -- preempt_enable(); --} -- --#endif /* CONFIG_SMP && CONFIG_PREEMPT_RT */ -+#endif /* CONFIG_SMP */ - - #endif /* __LINUX_PREEMPT_H */ -diff --git a/include/linux/sched.h b/include/linux/sched.h -index b47c446ecf48..942b87f80cc7 100644 ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -723,7 +723,7 @@ struct task_struct { - const cpumask_t *cpus_ptr; - cpumask_t cpus_mask; - void *migration_pending; --#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+#ifdef CONFIG_SMP - unsigned short migration_disabled; - #endif - unsigned short migration_flags; -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 9429059d96e3..4dad5b392f86 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -1694,8 +1694,6 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags) - - #ifdef CONFIG_SMP - --#ifdef CONFIG_PREEMPT_RT -- - static void - __do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask, u32 flags); - -@@ -1770,8 +1768,6 @@ static inline bool rq_has_pinned_tasks(struct rq *rq) - return rq->nr_pinned; - } - --#endif -- - /* - * Per-CPU kthreads are allowed to run on !active && online CPUs, see - * __set_cpus_allowed_ptr() and select_fallback_rq(). 
-@@ -2852,7 +2848,7 @@ void sched_set_stop_task(int cpu, struct task_struct *stop) - } - } - --#else -+#else /* CONFIG_SMP */ - - static inline int __set_cpus_allowed_ptr(struct task_struct *p, - const struct cpumask *new_mask, -@@ -2861,10 +2857,6 @@ static inline int __set_cpus_allowed_ptr(struct task_struct *p, - return set_cpus_allowed_ptr(p, new_mask); - } - --#endif /* CONFIG_SMP */ -- --#if !defined(CONFIG_SMP) || !defined(CONFIG_PREEMPT_RT) -- - static inline void migrate_disable_switch(struct rq *rq, struct task_struct *p) { } - - static inline bool rq_has_pinned_tasks(struct rq *rq) -@@ -2872,7 +2864,7 @@ static inline bool rq_has_pinned_tasks(struct rq *rq) - return false; - } - --#endif -+#endif /* !CONFIG_SMP */ - - static void - ttwu_stat(struct task_struct *p, int cpu, int wake_flags) -@@ -7898,6 +7890,39 @@ void __cant_sleep(const char *file, int line, int preempt_offset) - add_taint(TAINT_WARN, LOCKDEP_STILL_OK); - } - EXPORT_SYMBOL_GPL(__cant_sleep); -+ -+#ifdef CONFIG_SMP -+void __cant_migrate(const char *file, int line) -+{ -+ static unsigned long prev_jiffy; -+ -+ if (irqs_disabled()) -+ return; -+ -+ if (is_migration_disabled(current)) -+ return; -+ -+ if (!IS_ENABLED(CONFIG_PREEMPT_COUNT)) -+ return; -+ -+ if (preempt_count() > 0) -+ return; -+ -+ if (time_before(jiffies, prev_jiffy + HZ) && prev_jiffy) -+ return; -+ prev_jiffy = jiffies; -+ -+ pr_err("BUG: assuming non migratable context at %s:%d\n", file, line); -+ pr_err("in_atomic(): %d, irqs_disabled(): %d, migration_disabled() %u pid: %d, name: %s\n", -+ in_atomic(), irqs_disabled(), is_migration_disabled(current), -+ current->pid, current->comm); -+ -+ debug_show_held_locks(current); -+ dump_stack(); -+ add_taint(TAINT_WARN, LOCKDEP_STILL_OK); -+} -+EXPORT_SYMBOL_GPL(__cant_migrate); -+#endif - #endif - - #ifdef CONFIG_MAGIC_SYSRQ -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index fb8af162ccaf..605f803937f0 100644 ---- a/kernel/sched/sched.h -+++ b/kernel/sched/sched.h 
-@@ -1049,7 +1049,7 @@ struct rq { - struct cpuidle_state *idle_state; - #endif - --#if defined(CONFIG_PREEMPT_RT) && defined(CONFIG_SMP) -+#ifdef CONFIG_SMP - unsigned int nr_pinned; - #endif - unsigned int push_busy; -@@ -1085,7 +1085,7 @@ static inline int cpu_of(struct rq *rq) - - static inline bool is_migration_disabled(struct task_struct *p) - { --#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+#ifdef CONFIG_SMP - return p->migration_disabled; - #else - return false; -diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c -index faaa927ac2c8..1c1dbd300325 100644 ---- a/lib/smp_processor_id.c -+++ b/lib/smp_processor_id.c -@@ -26,7 +26,7 @@ unsigned int check_preemption_disabled(const char *what1, const char *what2) - if (current->nr_cpus_allowed == 1) - goto out; - --#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+#ifdef CONFIG_SMP - if (current->migration_disabled) - goto out; - #endif --- -2.30.2 - diff --git a/debian/patches-rt/0047-sched-highmem-Store-local-kmaps-in-task-struct.patch b/debian/patches-rt/0047-sched-highmem-Store-local-kmaps-in-task-struct.patch deleted file mode 100644 index b507deaef..000000000 --- a/debian/patches-rt/0047-sched-highmem-Store-local-kmaps-in-task-struct.patch +++ /dev/null @@ -1,309 +0,0 @@ -From 91579d9fbb60f3539968f90345d78d9bd9e18a30 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:36 +0100 -Subject: [PATCH 047/296] sched: highmem: Store local kmaps in task struct -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Instead of storing the map per CPU provide and use per task storage. That -prepares for local kmaps which are preemptible. - -The context switch code is preparatory and not yet in use because -kmap_atomic() runs with preemption disabled. Will be made usable in the -next step. - -The context switch logic is safe even when an interrupt happens after -clearing or before restoring the kmaps. 
The kmap index in task struct is -not modified so any nesting kmap in an interrupt will use unused indices -and on return the counter is the same as before. - -Also add an assert into the return to user space code. Going back to user -space with an active kmap local is a nono. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem-internal.h | 10 ++++ - include/linux/sched.h | 9 +++ - kernel/entry/common.c | 2 + - kernel/fork.c | 1 + - kernel/sched/core.c | 18 ++++++ - mm/highmem.c | 99 ++++++++++++++++++++++++++++---- - 6 files changed, 129 insertions(+), 10 deletions(-) - -diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h -index 6ceed907b14e..c5a22177db85 100644 ---- a/include/linux/highmem-internal.h -+++ b/include/linux/highmem-internal.h -@@ -9,6 +9,16 @@ - void *__kmap_local_pfn_prot(unsigned long pfn, pgprot_t prot); - void *__kmap_local_page_prot(struct page *page, pgprot_t prot); - void kunmap_local_indexed(void *vaddr); -+void kmap_local_fork(struct task_struct *tsk); -+void __kmap_local_sched_out(void); -+void __kmap_local_sched_in(void); -+static inline void kmap_assert_nomap(void) -+{ -+ DEBUG_LOCKS_WARN_ON(current->kmap_ctrl.idx); -+} -+#else -+static inline void kmap_local_fork(struct task_struct *tsk) { } -+static inline void kmap_assert_nomap(void) { } - #endif - - #ifdef CONFIG_HIGHMEM -diff --git a/include/linux/sched.h b/include/linux/sched.h -index 942b87f80cc7..6e324d455a0c 100644 ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -34,6 +34,7 @@ - #include <linux/rseq.h> - #include <linux/seqlock.h> - #include <linux/kcsan.h> -+#include <asm/kmap_size.h> - - /* task_struct member predeclarations (sorted alphabetically): */ - struct audit_context; -@@ -637,6 +638,13 @@ struct wake_q_node { - struct wake_q_node *next; - }; - -+struct kmap_ctrl { -+#ifdef CONFIG_KMAP_LOCAL -+ int idx; -+ pte_t 
pteval[KM_MAX_IDX]; -+#endif -+}; -+ - struct task_struct { - #ifdef CONFIG_THREAD_INFO_IN_TASK - /* -@@ -1316,6 +1324,7 @@ struct task_struct { - unsigned int sequential_io; - unsigned int sequential_io_avg; - #endif -+ struct kmap_ctrl kmap_ctrl; - #ifdef CONFIG_DEBUG_ATOMIC_SLEEP - unsigned long task_state_change; - #endif -diff --git a/kernel/entry/common.c b/kernel/entry/common.c -index e289e6773292..59c666d9d43c 100644 ---- a/kernel/entry/common.c -+++ b/kernel/entry/common.c -@@ -2,6 +2,7 @@ - - #include <linux/context_tracking.h> - #include <linux/entry-common.h> -+#include <linux/highmem.h> - #include <linux/livepatch.h> - #include <linux/audit.h> - -@@ -194,6 +195,7 @@ static void exit_to_user_mode_prepare(struct pt_regs *regs) - - /* Ensure that the address limit is intact and no locks are held */ - addr_limit_user_check(); -+ kmap_assert_nomap(); - lockdep_assert_irqs_disabled(); - lockdep_sys_exit(); - } -diff --git a/kernel/fork.c b/kernel/fork.c -index 7c044d377926..356104c1e0a3 100644 ---- a/kernel/fork.c -+++ b/kernel/fork.c -@@ -930,6 +930,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) - account_kernel_stack(tsk, 1); - - kcov_task_init(tsk); -+ kmap_local_fork(tsk); - - #ifdef CONFIG_FAULT_INJECTION - tsk->fail_nth = 0; -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 4dad5b392f86..6763a6ec8715 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -4068,6 +4068,22 @@ static inline void finish_lock_switch(struct rq *rq) - # define finish_arch_post_lock_switch() do { } while (0) - #endif - -+static inline void kmap_local_sched_out(void) -+{ -+#ifdef CONFIG_KMAP_LOCAL -+ if (unlikely(current->kmap_ctrl.idx)) -+ __kmap_local_sched_out(); -+#endif -+} -+ -+static inline void kmap_local_sched_in(void) -+{ -+#ifdef CONFIG_KMAP_LOCAL -+ if (unlikely(current->kmap_ctrl.idx)) -+ __kmap_local_sched_in(); -+#endif -+} -+ - /** - * prepare_task_switch - prepare to switch tasks - * @rq: the runqueue 
preparing to switch -@@ -4090,6 +4106,7 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev, - perf_event_task_sched_out(prev, next); - rseq_preempt(prev); - fire_sched_out_preempt_notifiers(prev, next); -+ kmap_local_sched_out(); - prepare_task(next); - prepare_arch_switch(next); - } -@@ -4156,6 +4173,7 @@ static struct rq *finish_task_switch(struct task_struct *prev) - finish_lock_switch(rq); - finish_arch_post_lock_switch(); - kcov_finish_switch(current); -+ kmap_local_sched_in(); - - fire_sched_in_preempt_notifiers(current); - /* -diff --git a/mm/highmem.c b/mm/highmem.c -index 54bd233846c9..d7a1c80001d0 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -365,8 +365,6 @@ EXPORT_SYMBOL(kunmap_high); - - #include <asm/kmap_size.h> - --static DEFINE_PER_CPU(int, __kmap_local_idx); -- - /* - * With DEBUG_HIGHMEM the stack depth is doubled and every second - * slot is unused which acts as a guard page -@@ -379,23 +377,21 @@ static DEFINE_PER_CPU(int, __kmap_local_idx); - - static inline int kmap_local_idx_push(void) - { -- int idx = __this_cpu_add_return(__kmap_local_idx, KM_INCR) - 1; -- - WARN_ON_ONCE(in_irq() && !irqs_disabled()); -- BUG_ON(idx >= KM_MAX_IDX); -- return idx; -+ current->kmap_ctrl.idx += KM_INCR; -+ BUG_ON(current->kmap_ctrl.idx >= KM_MAX_IDX); -+ return current->kmap_ctrl.idx - 1; - } - - static inline int kmap_local_idx(void) - { -- return __this_cpu_read(__kmap_local_idx) - 1; -+ return current->kmap_ctrl.idx - 1; - } - - static inline void kmap_local_idx_pop(void) - { -- int idx = __this_cpu_sub_return(__kmap_local_idx, KM_INCR); -- -- BUG_ON(idx < 0); -+ current->kmap_ctrl.idx -= KM_INCR; -+ BUG_ON(current->kmap_ctrl.idx < 0); - } - - #ifndef arch_kmap_local_post_map -@@ -461,6 +457,7 @@ void *__kmap_local_pfn_prot(unsigned long pfn, pgprot_t prot) - pteval = pfn_pte(pfn, prot); - set_pte_at(&init_mm, vaddr, kmap_pte - idx, pteval); - arch_kmap_local_post_map(vaddr, pteval); -+ current->kmap_ctrl.pteval[kmap_local_idx()] = pteval; 
- preempt_enable(); - - return (void *)vaddr; -@@ -505,10 +502,92 @@ void kunmap_local_indexed(void *vaddr) - arch_kmap_local_pre_unmap(addr); - pte_clear(&init_mm, addr, kmap_pte - idx); - arch_kmap_local_post_unmap(addr); -+ current->kmap_ctrl.pteval[kmap_local_idx()] = __pte(0); - kmap_local_idx_pop(); - preempt_enable(); - } - EXPORT_SYMBOL(kunmap_local_indexed); -+ -+/* -+ * Invoked before switch_to(). This is safe even when during or after -+ * clearing the maps an interrupt which needs a kmap_local happens because -+ * the task::kmap_ctrl.idx is not modified by the unmapping code so a -+ * nested kmap_local will use the next unused index and restore the index -+ * on unmap. The already cleared kmaps of the outgoing task are irrelevant -+ * because the interrupt context does not know about them. The same applies -+ * when scheduling back in for an interrupt which happens before the -+ * restore is complete. -+ */ -+void __kmap_local_sched_out(void) -+{ -+ struct task_struct *tsk = current; -+ pte_t *kmap_pte = kmap_get_pte(); -+ int i; -+ -+ /* Clear kmaps */ -+ for (i = 0; i < tsk->kmap_ctrl.idx; i++) { -+ pte_t pteval = tsk->kmap_ctrl.pteval[i]; -+ unsigned long addr; -+ int idx; -+ -+ /* With debug all even slots are unmapped and act as guard */ -+ if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM) && !(i & 0x01)) { -+ WARN_ON_ONCE(!pte_none(pteval)); -+ continue; -+ } -+ if (WARN_ON_ONCE(pte_none(pteval))) -+ continue; -+ -+ /* -+ * This is a horrible hack for XTENSA to calculate the -+ * coloured PTE index. Uses the PFN encoded into the pteval -+ * and the map index calculation because the actual mapped -+ * virtual address is not stored in task::kmap_ctrl. -+ * For any sane architecture this is optimized out. 
-+ */ -+ idx = arch_kmap_local_map_idx(i, pte_pfn(pteval)); -+ -+ addr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -+ arch_kmap_local_pre_unmap(addr); -+ pte_clear(&init_mm, addr, kmap_pte - idx); -+ arch_kmap_local_post_unmap(addr); -+ } -+} -+ -+void __kmap_local_sched_in(void) -+{ -+ struct task_struct *tsk = current; -+ pte_t *kmap_pte = kmap_get_pte(); -+ int i; -+ -+ /* Restore kmaps */ -+ for (i = 0; i < tsk->kmap_ctrl.idx; i++) { -+ pte_t pteval = tsk->kmap_ctrl.pteval[i]; -+ unsigned long addr; -+ int idx; -+ -+ /* With debug all even slots are unmapped and act as guard */ -+ if (IS_ENABLED(CONFIG_DEBUG_HIGHMEM) && !(i & 0x01)) { -+ WARN_ON_ONCE(!pte_none(pteval)); -+ continue; -+ } -+ if (WARN_ON_ONCE(pte_none(pteval))) -+ continue; -+ -+ /* See comment in __kmap_local_sched_out() */ -+ idx = arch_kmap_local_map_idx(i, pte_pfn(pteval)); -+ addr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -+ set_pte_at(&init_mm, addr, kmap_pte - idx, pteval); -+ arch_kmap_local_post_map(addr, pteval); -+ } -+} -+ -+void kmap_local_fork(struct task_struct *tsk) -+{ -+ if (WARN_ON_ONCE(tsk->kmap_ctrl.idx)) -+ memset(&tsk->kmap_ctrl, 0, sizeof(tsk->kmap_ctrl)); -+} -+ - #endif - - #if defined(HASHED_PAGE_VIRTUAL) --- -2.30.2 - diff --git a/debian/patches-rt/0048-mm-highmem-Provide-kmap_local.patch b/debian/patches-rt/0048-mm-highmem-Provide-kmap_local.patch deleted file mode 100644 index 2027e35e9..000000000 --- a/debian/patches-rt/0048-mm-highmem-Provide-kmap_local.patch +++ /dev/null @@ -1,207 +0,0 @@ -From 2756999cde0cbec55ae52fd9f4f85f27b0ae6242 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:37 +0100 -Subject: [PATCH 048/296] mm/highmem: Provide kmap_local* -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Now that the kmap atomic index is stored in task struct provide a -preemptible variant. 
On context switch the maps of an outgoing task are -removed and the map of the incoming task are restored. That's obviously -slow, but highmem is slow anyway. - -The kmap_local.*() functions can be invoked from both preemptible and -atomic context. kmap local sections disable migration to keep the resulting -virtual mapping address correct, but disable neither pagefaults nor -preemption. - -A wholesale conversion of kmap_atomic to be fully preemptible is not -possible because some of the usage sites might rely on the preemption -disable for serialization or on the implicit pagefault disable. Needs to be -done on a case by case basis. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem-internal.h | 48 ++++++++++++++++++++++++++++++++ - include/linux/highmem.h | 43 +++++++++++++++++----------- - mm/highmem.c | 6 ++++ - 3 files changed, 81 insertions(+), 16 deletions(-) - -diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h -index c5a22177db85..1bbe96dc8be6 100644 ---- a/include/linux/highmem-internal.h -+++ b/include/linux/highmem-internal.h -@@ -68,6 +68,26 @@ static inline void kmap_flush_unused(void) - __kmap_flush_unused(); - } - -+static inline void *kmap_local_page(struct page *page) -+{ -+ return __kmap_local_page_prot(page, kmap_prot); -+} -+ -+static inline void *kmap_local_page_prot(struct page *page, pgprot_t prot) -+{ -+ return __kmap_local_page_prot(page, prot); -+} -+ -+static inline void *kmap_local_pfn(unsigned long pfn) -+{ -+ return __kmap_local_pfn_prot(pfn, kmap_prot); -+} -+ -+static inline void __kunmap_local(void *vaddr) -+{ -+ kunmap_local_indexed(vaddr); -+} -+ - static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) - { - preempt_disable(); -@@ -140,6 +160,28 @@ static inline void kunmap(struct page *page) - #endif - } - -+static inline void *kmap_local_page(struct page *page) -+{ -+ return 
page_address(page); -+} -+ -+static inline void *kmap_local_page_prot(struct page *page, pgprot_t prot) -+{ -+ return kmap_local_page(page); -+} -+ -+static inline void *kmap_local_pfn(unsigned long pfn) -+{ -+ return kmap_local_page(pfn_to_page(pfn)); -+} -+ -+static inline void __kunmap_local(void *addr) -+{ -+#ifdef ARCH_HAS_FLUSH_ON_KUNMAP -+ kunmap_flush_on_unmap(addr); -+#endif -+} -+ - static inline void *kmap_atomic(struct page *page) - { - preempt_disable(); -@@ -181,4 +223,10 @@ do { \ - __kunmap_atomic(__addr); \ - } while (0) - -+#define kunmap_local(__addr) \ -+do { \ -+ BUILD_BUG_ON(__same_type((__addr), struct page *)); \ -+ __kunmap_local(__addr); \ -+} while (0) -+ - #endif -diff --git a/include/linux/highmem.h b/include/linux/highmem.h -index 7d098bd621f6..f597830f26b4 100644 ---- a/include/linux/highmem.h -+++ b/include/linux/highmem.h -@@ -60,24 +60,22 @@ static inline struct page *kmap_to_page(void *addr); - static inline void kmap_flush_unused(void); - - /** -- * kmap_atomic - Atomically map a page for temporary usage -+ * kmap_local_page - Map a page for temporary usage - * @page: Pointer to the page to be mapped - * - * Returns: The virtual address of the mapping - * -- * Side effect: On return pagefaults and preemption are disabled. -- * - * Can be invoked from any context. - * - * Requires careful handling when nesting multiple mappings because the map - * management is stack based. The unmap has to be in the reverse order of - * the map operation: - * -- * addr1 = kmap_atomic(page1); -- * addr2 = kmap_atomic(page2); -+ * addr1 = kmap_local_page(page1); -+ * addr2 = kmap_local_page(page2); - * ... -- * kunmap_atomic(addr2); -- * kunmap_atomic(addr1); -+ * kunmap_local(addr2); -+ * kunmap_local(addr1); - * - * Unmapping addr1 before addr2 is invalid and causes malfunction. - * -@@ -88,10 +86,26 @@ static inline void kmap_flush_unused(void); - * virtual address of the direct mapping. Only real highmem pages are - * temporarily mapped. 
- * -- * While it is significantly faster than kmap() it comes with restrictions -- * about the pointer validity and the side effects of disabling page faults -- * and preemption. Use it only when absolutely necessary, e.g. from non -- * preemptible contexts. -+ * While it is significantly faster than kmap() for the higmem case it -+ * comes with restrictions about the pointer validity. Only use when really -+ * necessary. -+ * -+ * On HIGHMEM enabled systems mapping a highmem page has the side effect of -+ * disabling migration in order to keep the virtual address stable across -+ * preemption. No caller of kmap_local_page() can rely on this side effect. -+ */ -+static inline void *kmap_local_page(struct page *page); -+ -+/** -+ * kmap_atomic - Atomically map a page for temporary usage - Deprecated! -+ * @page: Pointer to the page to be mapped -+ * -+ * Returns: The virtual address of the mapping -+ * -+ * Effectively a wrapper around kmap_local_page() which disables pagefaults -+ * and preemption. -+ * -+ * Do not use in new code. Use kmap_local_page() instead. - */ - static inline void *kmap_atomic(struct page *page); - -@@ -101,12 +115,9 @@ static inline void *kmap_atomic(struct page *page); - * - * Counterpart to kmap_atomic(). - * -- * Undoes the side effects of kmap_atomic(), i.e. reenabling pagefaults and -+ * Effectively a wrapper around kunmap_local() which additionally undoes -+ * the side effects of kmap_atomic(), i.e. reenabling pagefaults and - * preemption. -- * -- * Other than that a NOOP for CONFIG_HIGHMEM=n and for mappings of pages -- * in the low memory area. For real highmen pages the mapping which was -- * established with kmap_atomic() is destroyed. 
- */ - - /* Highmem related interfaces for management code */ -diff --git a/mm/highmem.c b/mm/highmem.c -index d7a1c80001d0..8db577e5290c 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -450,6 +450,11 @@ void *__kmap_local_pfn_prot(unsigned long pfn, pgprot_t prot) - unsigned long vaddr; - int idx; - -+ /* -+ * Disable migration so resulting virtual address is stable -+ * accross preemption. -+ */ -+ migrate_disable(); - preempt_disable(); - idx = arch_kmap_local_map_idx(kmap_local_idx_push(), pfn); - vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); -@@ -505,6 +510,7 @@ void kunmap_local_indexed(void *vaddr) - current->kmap_ctrl.pteval[kmap_local_idx()] = __pte(0); - kmap_local_idx_pop(); - preempt_enable(); -+ migrate_enable(); - } - EXPORT_SYMBOL(kunmap_local_indexed); - --- -2.30.2 - diff --git a/debian/patches-rt/0049-io-mapping-Provide-iomap_local-variant.patch b/debian/patches-rt/0049-io-mapping-Provide-iomap_local-variant.patch deleted file mode 100644 index fd7f601ea..000000000 --- a/debian/patches-rt/0049-io-mapping-Provide-iomap_local-variant.patch +++ /dev/null @@ -1,179 +0,0 @@ -From b6327019d18848a97ba04ec8c43ce0a446a6fe1f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:38 +0100 -Subject: [PATCH 049/296] io-mapping: Provide iomap_local variant -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Similar to kmap local provide a iomap local variant which only disables -migration, but neither disables pagefaults nor preemption. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - Documentation/driver-api/io-mapping.rst | 74 +++++++++++++++---------- - include/linux/io-mapping.h | 30 +++++++++- - 2 files changed, 73 insertions(+), 31 deletions(-) - -diff --git a/Documentation/driver-api/io-mapping.rst b/Documentation/driver-api/io-mapping.rst -index e33b88268554..a0cfb15988df 100644 ---- a/Documentation/driver-api/io-mapping.rst -+++ b/Documentation/driver-api/io-mapping.rst -@@ -20,55 +20,71 @@ A mapping object is created during driver initialization using:: - mappable, while 'size' indicates how large a mapping region to - enable. Both are in bytes. - --This _wc variant provides a mapping which may only be used --with the io_mapping_map_atomic_wc or io_mapping_map_wc. -+This _wc variant provides a mapping which may only be used with -+io_mapping_map_atomic_wc(), io_mapping_map_local_wc() or -+io_mapping_map_wc(). - --With this mapping object, individual pages can be mapped either atomically --or not, depending on the necessary scheduling environment. Of course, atomic --maps are more efficient:: -+With this mapping object, individual pages can be mapped either temporarily -+or long term, depending on the requirements. Of course, temporary maps are -+more efficient. They come in two flavours:: -+ -+ void *io_mapping_map_local_wc(struct io_mapping *mapping, -+ unsigned long offset) - - void *io_mapping_map_atomic_wc(struct io_mapping *mapping, - unsigned long offset) - --'offset' is the offset within the defined mapping region. --Accessing addresses beyond the region specified in the --creation function yields undefined results. Using an offset --which is not page aligned yields an undefined result. The --return value points to a single page in CPU address space. -+'offset' is the offset within the defined mapping region. 
Accessing -+addresses beyond the region specified in the creation function yields -+undefined results. Using an offset which is not page aligned yields an -+undefined result. The return value points to a single page in CPU address -+space. - --This _wc variant returns a write-combining map to the --page and may only be used with mappings created by --io_mapping_create_wc -+This _wc variant returns a write-combining map to the page and may only be -+used with mappings created by io_mapping_create_wc() - --Note that the task may not sleep while holding this page --mapped. -+Temporary mappings are only valid in the context of the caller. The mapping -+is not guaranteed to be globaly visible. - --:: -+io_mapping_map_local_wc() has a side effect on X86 32bit as it disables -+migration to make the mapping code work. No caller can rely on this side -+effect. - -- void io_mapping_unmap_atomic(void *vaddr) -+io_mapping_map_atomic_wc() has the side effect of disabling preemption and -+pagefaults. Don't use in new code. Use io_mapping_map_local_wc() instead. - --'vaddr' must be the value returned by the last --io_mapping_map_atomic_wc call. This unmaps the specified --page and allows the task to sleep once again. -+Nested mappings need to be undone in reverse order because the mapping -+code uses a stack for keeping track of them:: - --If you need to sleep while holding the lock, you can use the non-atomic --variant, although they may be significantly slower. -+ addr1 = io_mapping_map_local_wc(map1, offset1); -+ addr2 = io_mapping_map_local_wc(map2, offset2); -+ ... -+ io_mapping_unmap_local(addr2); -+ io_mapping_unmap_local(addr1); - --:: -+The mappings are released with:: -+ -+ void io_mapping_unmap_local(void *vaddr) -+ void io_mapping_unmap_atomic(void *vaddr) -+ -+'vaddr' must be the value returned by the last io_mapping_map_local_wc() or -+io_mapping_map_atomic_wc() call. This unmaps the specified mapping and -+undoes the side effects of the mapping functions. 
-+ -+If you need to sleep while holding a mapping, you can use the regular -+variant, although this may be significantly slower:: - - void *io_mapping_map_wc(struct io_mapping *mapping, - unsigned long offset) - --This works like io_mapping_map_atomic_wc except it allows --the task to sleep while holding the page mapped. -- -+This works like io_mapping_map_atomic/local_wc() except it has no side -+effects and the pointer is globaly visible. - --:: -+The mappings are released with:: - - void io_mapping_unmap(void *vaddr) - --This works like io_mapping_unmap_atomic, except it is used --for pages mapped with io_mapping_map_wc. -+Use for pages mapped with io_mapping_map_wc(). - - At driver close time, the io_mapping object must be freed:: - -diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h -index 60e7c83e4904..c093e81310a9 100644 ---- a/include/linux/io-mapping.h -+++ b/include/linux/io-mapping.h -@@ -82,6 +82,21 @@ io_mapping_unmap_atomic(void __iomem *vaddr) - preempt_enable(); - } - -+static inline void __iomem * -+io_mapping_map_local_wc(struct io_mapping *mapping, unsigned long offset) -+{ -+ resource_size_t phys_addr; -+ -+ BUG_ON(offset >= mapping->size); -+ phys_addr = mapping->base + offset; -+ return __iomap_local_pfn_prot(PHYS_PFN(phys_addr), mapping->prot); -+} -+ -+static inline void io_mapping_unmap_local(void __iomem *vaddr) -+{ -+ kunmap_local_indexed((void __force *)vaddr); -+} -+ - static inline void __iomem * - io_mapping_map_wc(struct io_mapping *mapping, - unsigned long offset, -@@ -101,7 +116,7 @@ io_mapping_unmap(void __iomem *vaddr) - iounmap(vaddr); - } - --#else -+#else /* HAVE_ATOMIC_IOMAP */ - - #include <linux/uaccess.h> - -@@ -166,7 +181,18 @@ io_mapping_unmap_atomic(void __iomem *vaddr) - preempt_enable(); - } - --#endif /* HAVE_ATOMIC_IOMAP */ -+static inline void __iomem * -+io_mapping_map_local_wc(struct io_mapping *mapping, unsigned long offset) -+{ -+ return io_mapping_map_wc(mapping, offset, PAGE_SIZE); -+} -+ 
-+static inline void io_mapping_unmap_local(void __iomem *vaddr) -+{ -+ io_mapping_unmap(vaddr); -+} -+ -+#endif /* !HAVE_ATOMIC_IOMAP */ - - static inline struct io_mapping * - io_mapping_create_wc(resource_size_t base, --- -2.30.2 - diff --git a/debian/patches-rt/0050-x86-crashdump-32-Simplify-copy_oldmem_page.patch b/debian/patches-rt/0050-x86-crashdump-32-Simplify-copy_oldmem_page.patch deleted file mode 100644 index 78fa4dafe..000000000 --- a/debian/patches-rt/0050-x86-crashdump-32-Simplify-copy_oldmem_page.patch +++ /dev/null @@ -1,99 +0,0 @@ -From 4c618bbc25e3016213e8eca6d66741b600567366 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:39 +0100 -Subject: [PATCH 050/296] x86/crashdump/32: Simplify copy_oldmem_page() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Replace kmap_atomic_pfn() with kmap_local_pfn() which is preemptible and -can take page faults. - -Remove the indirection of the dump page and the related cruft which is not -longer required. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/x86/kernel/crash_dump_32.c | 48 +++++++-------------------------- - 1 file changed, 10 insertions(+), 38 deletions(-) - -diff --git a/arch/x86/kernel/crash_dump_32.c b/arch/x86/kernel/crash_dump_32.c -index 33ee47670b99..5fcac46aaf6b 100644 ---- a/arch/x86/kernel/crash_dump_32.c -+++ b/arch/x86/kernel/crash_dump_32.c -@@ -13,8 +13,6 @@ - - #include <linux/uaccess.h> - --static void *kdump_buf_page; -- - static inline bool is_crashed_pfn_valid(unsigned long pfn) - { - #ifndef CONFIG_X86_PAE -@@ -41,15 +39,11 @@ static inline bool is_crashed_pfn_valid(unsigned long pfn) - * @userbuf: if set, @buf is in user address space, use copy_to_user(), - * otherwise @buf is in kernel address space, use memcpy(). - * -- * Copy a page from "oldmem". 
For this page, there is no pte mapped -- * in the current kernel. We stitch up a pte, similar to kmap_atomic. -- * -- * Calling copy_to_user() in atomic context is not desirable. Hence first -- * copying the data to a pre-allocated kernel page and then copying to user -- * space in non-atomic context. -+ * Copy a page from "oldmem". For this page, there might be no pte mapped -+ * in the current kernel. - */ --ssize_t copy_oldmem_page(unsigned long pfn, char *buf, -- size_t csize, unsigned long offset, int userbuf) -+ssize_t copy_oldmem_page(unsigned long pfn, char *buf, size_t csize, -+ unsigned long offset, int userbuf) - { - void *vaddr; - -@@ -59,38 +53,16 @@ ssize_t copy_oldmem_page(unsigned long pfn, char *buf, - if (!is_crashed_pfn_valid(pfn)) - return -EFAULT; - -- vaddr = kmap_atomic_pfn(pfn); -+ vaddr = kmap_local_pfn(pfn); - - if (!userbuf) { -- memcpy(buf, (vaddr + offset), csize); -- kunmap_atomic(vaddr); -+ memcpy(buf, vaddr + offset, csize); - } else { -- if (!kdump_buf_page) { -- printk(KERN_WARNING "Kdump: Kdump buffer page not" -- " allocated\n"); -- kunmap_atomic(vaddr); -- return -EFAULT; -- } -- copy_page(kdump_buf_page, vaddr); -- kunmap_atomic(vaddr); -- if (copy_to_user(buf, (kdump_buf_page + offset), csize)) -- return -EFAULT; -+ if (copy_to_user(buf, vaddr + offset, csize)) -+ csize = -EFAULT; - } - -- return csize; --} -+ kunmap_local(vaddr); - --static int __init kdump_buf_page_init(void) --{ -- int ret = 0; -- -- kdump_buf_page = kmalloc(PAGE_SIZE, GFP_KERNEL); -- if (!kdump_buf_page) { -- printk(KERN_WARNING "Kdump: Failed to allocate kdump buffer" -- " page\n"); -- ret = -ENOMEM; -- } -- -- return ret; -+ return csize; - } --arch_initcall(kdump_buf_page_init); --- -2.30.2 - diff --git a/debian/patches-rt/0051-mips-crashdump-Simplify-copy_oldmem_page.patch b/debian/patches-rt/0051-mips-crashdump-Simplify-copy_oldmem_page.patch deleted file mode 100644 index 5317d911a..000000000 --- 
a/debian/patches-rt/0051-mips-crashdump-Simplify-copy_oldmem_page.patch +++ /dev/null @@ -1,95 +0,0 @@ -From e8420c943ce30f0ae9ea82e09722a407017cc029 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:40 +0100 -Subject: [PATCH 051/296] mips/crashdump: Simplify copy_oldmem_page() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Replace kmap_atomic_pfn() with kmap_local_pfn() which is preemptible and -can take page faults. - -Remove the indirection of the dump page and the related cruft which is not -longer required. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> -Cc: linux-mips@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/mips/kernel/crash_dump.c | 42 ++++++----------------------------- - 1 file changed, 7 insertions(+), 35 deletions(-) - -diff --git a/arch/mips/kernel/crash_dump.c b/arch/mips/kernel/crash_dump.c -index 01b2bd95ba1f..9aba83e1eeb4 100644 ---- a/arch/mips/kernel/crash_dump.c -+++ b/arch/mips/kernel/crash_dump.c -@@ -5,8 +5,6 @@ - #include <linux/uaccess.h> - #include <linux/slab.h> - --static void *kdump_buf_page; -- - /** - * copy_oldmem_page - copy one page from "oldmem" - * @pfn: page frame number to be copied -@@ -17,51 +15,25 @@ static void *kdump_buf_page; - * @userbuf: if set, @buf is in user address space, use copy_to_user(), - * otherwise @buf is in kernel address space, use memcpy(). - * -- * Copy a page from "oldmem". For this page, there is no pte mapped -+ * Copy a page from "oldmem". For this page, there might be no pte mapped - * in the current kernel. -- * -- * Calling copy_to_user() in atomic context is not desirable. Hence first -- * copying the data to a pre-allocated kernel page and then copying to user -- * space in non-atomic context. 
- */ --ssize_t copy_oldmem_page(unsigned long pfn, char *buf, -- size_t csize, unsigned long offset, int userbuf) -+ssize_t copy_oldmem_page(unsigned long pfn, char *buf, size_t csize, -+ unsigned long offset, int userbuf) - { - void *vaddr; - - if (!csize) - return 0; - -- vaddr = kmap_atomic_pfn(pfn); -+ vaddr = kmap_local_pfn(pfn); - - if (!userbuf) { -- memcpy(buf, (vaddr + offset), csize); -- kunmap_atomic(vaddr); -+ memcpy(buf, vaddr + offset, csize); - } else { -- if (!kdump_buf_page) { -- pr_warn("Kdump: Kdump buffer page not allocated\n"); -- -- return -EFAULT; -- } -- copy_page(kdump_buf_page, vaddr); -- kunmap_atomic(vaddr); -- if (copy_to_user(buf, (kdump_buf_page + offset), csize)) -- return -EFAULT; -+ if (copy_to_user(buf, vaddr + offset, csize)) -+ csize = -EFAULT; - } - - return csize; - } -- --static int __init kdump_buf_page_init(void) --{ -- int ret = 0; -- -- kdump_buf_page = kmalloc(PAGE_SIZE, GFP_KERNEL); -- if (!kdump_buf_page) { -- pr_warn("Kdump: Failed to allocate kdump buffer page\n"); -- ret = -ENOMEM; -- } -- -- return ret; --} --arch_initcall(kdump_buf_page_init); --- -2.30.2 - diff --git a/debian/patches-rt/0052-ARM-mm-Replace-kmap_atomic_pfn.patch b/debian/patches-rt/0052-ARM-mm-Replace-kmap_atomic_pfn.patch deleted file mode 100644 index c1127684f..000000000 --- a/debian/patches-rt/0052-ARM-mm-Replace-kmap_atomic_pfn.patch +++ /dev/null @@ -1,71 +0,0 @@ -From dc12c5718aed6bc057eb665ac119a51cc3aaf163 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:41 +0100 -Subject: [PATCH 052/296] ARM: mm: Replace kmap_atomic_pfn() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -There is no requirement to disable pagefaults and preemption for these -cache management mappings. - -Replace kmap_atomic_pfn() with kmap_local_pfn(). This allows to remove -kmap_atomic_pfn() in the next step. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Russell King <linux@armlinux.org.uk> -Cc: linux-arm-kernel@lists.infradead.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arm/mm/cache-feroceon-l2.c | 6 +++--- - arch/arm/mm/cache-xsc3l2.c | 4 ++-- - 2 files changed, 5 insertions(+), 5 deletions(-) - -diff --git a/arch/arm/mm/cache-feroceon-l2.c b/arch/arm/mm/cache-feroceon-l2.c -index 5c1b7a7b9af6..87328766e910 100644 ---- a/arch/arm/mm/cache-feroceon-l2.c -+++ b/arch/arm/mm/cache-feroceon-l2.c -@@ -49,9 +49,9 @@ static inline unsigned long l2_get_va(unsigned long paddr) - * we simply install a virtual mapping for it only for the - * TLB lookup to occur, hence no need to flush the untouched - * memory mapping afterwards (note: a cache flush may happen -- * in some circumstances depending on the path taken in kunmap_atomic). -+ * in some circumstances depending on the path taken in kunmap_local). - */ -- void *vaddr = kmap_atomic_pfn(paddr >> PAGE_SHIFT); -+ void *vaddr = kmap_local_pfn(paddr >> PAGE_SHIFT); - return (unsigned long)vaddr + (paddr & ~PAGE_MASK); - #else - return __phys_to_virt(paddr); -@@ -61,7 +61,7 @@ static inline unsigned long l2_get_va(unsigned long paddr) - static inline void l2_put_va(unsigned long vaddr) - { - #ifdef CONFIG_HIGHMEM -- kunmap_atomic((void *)vaddr); -+ kunmap_local((void *)vaddr); - #endif - } - -diff --git a/arch/arm/mm/cache-xsc3l2.c b/arch/arm/mm/cache-xsc3l2.c -index d20d7af02d10..0e0a3abd8174 100644 ---- a/arch/arm/mm/cache-xsc3l2.c -+++ b/arch/arm/mm/cache-xsc3l2.c -@@ -59,7 +59,7 @@ static inline void l2_unmap_va(unsigned long va) - { - #ifdef CONFIG_HIGHMEM - if (va != -1) -- kunmap_atomic((void *)va); -+ kunmap_local((void *)va); - #endif - } - -@@ -75,7 +75,7 @@ static inline unsigned long l2_map_va(unsigned long pa, unsigned long prev_va) - * in place for it. 
- */ - l2_unmap_va(prev_va); -- va = (unsigned long)kmap_atomic_pfn(pa >> PAGE_SHIFT); -+ va = (unsigned long)kmap_local_pfn(pa >> PAGE_SHIFT); - } - return va + (pa_offset >> (32 - PAGE_SHIFT)); - #else --- -2.30.2 - diff --git a/debian/patches-rt/0053-highmem-Remove-kmap_atomic_pfn.patch b/debian/patches-rt/0053-highmem-Remove-kmap_atomic_pfn.patch deleted file mode 100644 index e237019df..000000000 --- a/debian/patches-rt/0053-highmem-Remove-kmap_atomic_pfn.patch +++ /dev/null @@ -1,47 +0,0 @@ -From 58b8f6fde8e7b10ca845b81e29bdb241da26ef7d Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:42 +0100 -Subject: [PATCH 053/296] highmem: Remove kmap_atomic_pfn() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No more users. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem-internal.h | 12 ------------ - 1 file changed, 12 deletions(-) - -diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h -index 1bbe96dc8be6..3590af5aad96 100644 ---- a/include/linux/highmem-internal.h -+++ b/include/linux/highmem-internal.h -@@ -100,13 +100,6 @@ static inline void *kmap_atomic(struct page *page) - return kmap_atomic_prot(page, kmap_prot); - } - --static inline void *kmap_atomic_pfn(unsigned long pfn) --{ -- preempt_disable(); -- pagefault_disable(); -- return __kmap_local_pfn_prot(pfn, kmap_prot); --} -- - static inline void __kunmap_atomic(void *addr) - { - kunmap_local_indexed(addr); -@@ -194,11 +187,6 @@ static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) - return kmap_atomic(page); - } - --static inline void *kmap_atomic_pfn(unsigned long pfn) --{ -- return kmap_atomic(pfn_to_page(pfn)); --} -- - static inline void __kunmap_atomic(void *addr) - { - #ifdef ARCH_HAS_FLUSH_ON_KUNMAP --- -2.30.2 - diff --git 
a/debian/patches-rt/0054-drm-ttm-Replace-kmap_atomic-usage.patch b/debian/patches-rt/0054-drm-ttm-Replace-kmap_atomic-usage.patch deleted file mode 100644 index bc8d18c56..000000000 --- a/debian/patches-rt/0054-drm-ttm-Replace-kmap_atomic-usage.patch +++ /dev/null @@ -1,74 +0,0 @@ -From 22c796eefe906a2b7c50a97cf81e93b1ddbd116f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:43 +0100 -Subject: [PATCH 054/296] drm/ttm: Replace kmap_atomic() usage -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -There is no reason to disable pagefaults and preemption as a side effect of -kmap_atomic_prot(). - -Use kmap_local_page_prot() instead and document the reasoning for the -mapping usage with the given pgprot. - -Remove the NULL pointer check for the map. These functions return a valid -address for valid pages and the return was bogus anyway as it would have -left preemption and pagefaults disabled. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Christian Koenig <christian.koenig@amd.com> -Cc: Huang Rui <ray.huang@amd.com> -Cc: David Airlie <airlied@linux.ie> -Cc: Daniel Vetter <daniel@ffwll.ch> -Cc: dri-devel@lists.freedesktop.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/gpu/drm/ttm/ttm_bo_util.c | 20 ++++++++++++-------- - 1 file changed, 12 insertions(+), 8 deletions(-) - -diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c -index fb2a25f8408f..164b9a015d32 100644 ---- a/drivers/gpu/drm/ttm/ttm_bo_util.c -+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c -@@ -181,13 +181,15 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src, - return -ENOMEM; - - src = (void *)((unsigned long)src + (page << PAGE_SHIFT)); -- dst = kmap_atomic_prot(d, prot); -- if (!dst) -- return -ENOMEM; -+ /* -+ * Ensure that a highmem page is mapped with the correct -+ * pgprot. 
For non highmem the mapping is already there. -+ */ -+ dst = kmap_local_page_prot(d, prot); - - memcpy_fromio(dst, src, PAGE_SIZE); - -- kunmap_atomic(dst); -+ kunmap_local(dst); - - return 0; - } -@@ -203,13 +205,15 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst, - return -ENOMEM; - - dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT)); -- src = kmap_atomic_prot(s, prot); -- if (!src) -- return -ENOMEM; -+ /* -+ * Ensure that a highmem page is mapped with the correct -+ * pgprot. For non highmem the mapping is already there. -+ */ -+ src = kmap_local_page_prot(s, prot); - - memcpy_toio(dst, src, PAGE_SIZE); - -- kunmap_atomic(src); -+ kunmap_local(src); - - return 0; - } --- -2.30.2 - diff --git a/debian/patches-rt/0055-drm-vmgfx-Replace-kmap_atomic.patch b/debian/patches-rt/0055-drm-vmgfx-Replace-kmap_atomic.patch deleted file mode 100644 index 6c343727d..000000000 --- a/debian/patches-rt/0055-drm-vmgfx-Replace-kmap_atomic.patch +++ /dev/null @@ -1,104 +0,0 @@ -From b00fba24dbce785bf73ab08e2eb4ebecc8fd4c0b Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:44 +0100 -Subject: [PATCH 055/296] drm/vmgfx: Replace kmap_atomic() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -There is no reason to disable pagefaults and preemption as a side effect of -kmap_atomic_prot(). - -Use kmap_local_page_prot() instead and document the reasoning for the -mapping usage with the given pgprot. - -Remove the NULL pointer check for the map. These functions return a valid -address for valid pages and the return was bogus anyway as it would have -left preemption and pagefaults disabled. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: VMware Graphics <linux-graphics-maintainer@vmware.com> -Cc: Roland Scheidegger <sroland@vmware.com> -Cc: David Airlie <airlied@linux.ie> -Cc: Daniel Vetter <daniel@ffwll.ch> -Cc: dri-devel@lists.freedesktop.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 30 +++++++++++----------------- - 1 file changed, 12 insertions(+), 18 deletions(-) - -diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c -index e8d66182cd7b..71dba228f68e 100644 ---- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c -+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c -@@ -375,12 +375,12 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d, - copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset); - - if (unmap_src) { -- kunmap_atomic(d->src_addr); -+ kunmap_local(d->src_addr); - d->src_addr = NULL; - } - - if (unmap_dst) { -- kunmap_atomic(d->dst_addr); -+ kunmap_local(d->dst_addr); - d->dst_addr = NULL; - } - -@@ -388,12 +388,8 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d, - if (WARN_ON_ONCE(dst_page >= d->dst_num_pages)) - return -EINVAL; - -- d->dst_addr = -- kmap_atomic_prot(d->dst_pages[dst_page], -- d->dst_prot); -- if (!d->dst_addr) -- return -ENOMEM; -- -+ d->dst_addr = kmap_local_page_prot(d->dst_pages[dst_page], -+ d->dst_prot); - d->mapped_dst = dst_page; - } - -@@ -401,12 +397,8 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d, - if (WARN_ON_ONCE(src_page >= d->src_num_pages)) - return -EINVAL; - -- d->src_addr = -- kmap_atomic_prot(d->src_pages[src_page], -- d->src_prot); -- if (!d->src_addr) -- return -ENOMEM; -- -+ d->src_addr = kmap_local_page_prot(d->src_pages[src_page], -+ d->src_prot); - d->mapped_src = src_page; - } - diff->do_cpy(diff, d->dst_addr + dst_page_offset, -@@ -436,8 +428,10 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d, - * - * Performs a 
CPU blit from one buffer object to another avoiding a full - * bo vmap which may exhaust- or fragment vmalloc space. -- * On supported architectures (x86), we're using kmap_atomic which avoids -- * cross-processor TLB- and cache flushes and may, on non-HIGHMEM systems -+ * -+ * On supported architectures (x86), we're using kmap_local_prot() which -+ * avoids cross-processor TLB- and cache flushes. kmap_local_prot() will -+ * either map a highmem page with the proper pgprot on HIGHMEM=y systems or - * reference already set-up mappings. - * - * Neither of the buffer objects may be placed in PCI memory -@@ -500,9 +494,9 @@ int vmw_bo_cpu_blit(struct ttm_buffer_object *dst, - } - out: - if (d.src_addr) -- kunmap_atomic(d.src_addr); -+ kunmap_local(d.src_addr); - if (d.dst_addr) -- kunmap_atomic(d.dst_addr); -+ kunmap_local(d.dst_addr); - - return ret; - } --- -2.30.2 - diff --git a/debian/patches-rt/0056-highmem-Remove-kmap_atomic_prot.patch b/debian/patches-rt/0056-highmem-Remove-kmap_atomic_prot.patch deleted file mode 100644 index a1c32e180..000000000 --- a/debian/patches-rt/0056-highmem-Remove-kmap_atomic_prot.patch +++ /dev/null @@ -1,52 +0,0 @@ -From 3fa5b7afb1f3604a040a3ac7f85020c20571d1f7 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:45 +0100 -Subject: [PATCH 056/296] highmem: Remove kmap_atomic_prot() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No more users. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem-internal.h | 14 ++------------ - 1 file changed, 2 insertions(+), 12 deletions(-) - -diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h -index 3590af5aad96..bd15bf9164c2 100644 ---- a/include/linux/highmem-internal.h -+++ b/include/linux/highmem-internal.h -@@ -88,16 +88,11 @@ static inline void __kunmap_local(void *vaddr) - kunmap_local_indexed(vaddr); - } - --static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) -+static inline void *kmap_atomic(struct page *page) - { - preempt_disable(); - pagefault_disable(); -- return __kmap_local_page_prot(page, prot); --} -- --static inline void *kmap_atomic(struct page *page) --{ -- return kmap_atomic_prot(page, kmap_prot); -+ return __kmap_local_page_prot(page, kmap_prot); - } - - static inline void __kunmap_atomic(void *addr) -@@ -182,11 +177,6 @@ static inline void *kmap_atomic(struct page *page) - return page_address(page); - } - --static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot) --{ -- return kmap_atomic(page); --} -- - static inline void __kunmap_atomic(void *addr) - { - #ifdef ARCH_HAS_FLUSH_ON_KUNMAP --- -2.30.2 - diff --git a/debian/patches-rt/0057-drm-qxl-Replace-io_mapping_map_atomic_wc.patch b/debian/patches-rt/0057-drm-qxl-Replace-io_mapping_map_atomic_wc.patch deleted file mode 100644 index b61bb2c44..000000000 --- a/debian/patches-rt/0057-drm-qxl-Replace-io_mapping_map_atomic_wc.patch +++ /dev/null @@ -1,257 +0,0 @@ -From 98b4066ba0c69ee936fbd3c70cafc560a6917c53 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:46 +0100 -Subject: [PATCH 057/296] drm/qxl: Replace io_mapping_map_atomic_wc() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -None of these mapping requires the side effect of 
disabling pagefaults and -preemption. - -Use io_mapping_map_local_wc() instead, rename the related functions -accordingly and clean up qxl_process_single_command() to use a plain -copy_from_user() as the local maps are not disabling pagefaults. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Dave Airlie <airlied@redhat.com> -Cc: Gerd Hoffmann <kraxel@redhat.com> -Cc: David Airlie <airlied@linux.ie> -Cc: Daniel Vetter <daniel@ffwll.ch> -Cc: virtualization@lists.linux-foundation.org -Cc: spice-devel@lists.freedesktop.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/gpu/drm/qxl/qxl_image.c | 18 +++++++++--------- - drivers/gpu/drm/qxl/qxl_ioctl.c | 27 +++++++++++++-------------- - drivers/gpu/drm/qxl/qxl_object.c | 12 ++++++------ - drivers/gpu/drm/qxl/qxl_object.h | 4 ++-- - drivers/gpu/drm/qxl/qxl_release.c | 4 ++-- - 5 files changed, 32 insertions(+), 33 deletions(-) - -diff --git a/drivers/gpu/drm/qxl/qxl_image.c b/drivers/gpu/drm/qxl/qxl_image.c -index 60ab7151b84d..93f92ccd42e5 100644 ---- a/drivers/gpu/drm/qxl/qxl_image.c -+++ b/drivers/gpu/drm/qxl/qxl_image.c -@@ -124,12 +124,12 @@ qxl_image_init_helper(struct qxl_device *qdev, - wrong (check the bitmaps are sent correctly - first) */ - -- ptr = qxl_bo_kmap_atomic_page(qdev, chunk_bo, 0); -+ ptr = qxl_bo_kmap_local_page(qdev, chunk_bo, 0); - chunk = ptr; - chunk->data_size = height * chunk_stride; - chunk->prev_chunk = 0; - chunk->next_chunk = 0; -- qxl_bo_kunmap_atomic_page(qdev, chunk_bo, ptr); -+ qxl_bo_kunmap_local_page(qdev, chunk_bo, ptr); - - { - void *k_data, *i_data; -@@ -143,7 +143,7 @@ qxl_image_init_helper(struct qxl_device *qdev, - i_data = (void *)data; - - while (remain > 0) { -- ptr = qxl_bo_kmap_atomic_page(qdev, chunk_bo, page << PAGE_SHIFT); -+ ptr = qxl_bo_kmap_local_page(qdev, chunk_bo, page << PAGE_SHIFT); - - if (page == 0) { - chunk = ptr; -@@ -157,7 +157,7 @@ qxl_image_init_helper(struct qxl_device *qdev, - - memcpy(k_data, i_data, size); 
- -- qxl_bo_kunmap_atomic_page(qdev, chunk_bo, ptr); -+ qxl_bo_kunmap_local_page(qdev, chunk_bo, ptr); - i_data += size; - remain -= size; - page++; -@@ -175,10 +175,10 @@ qxl_image_init_helper(struct qxl_device *qdev, - page_offset = offset_in_page(out_offset); - size = min((int)(PAGE_SIZE - page_offset), remain); - -- ptr = qxl_bo_kmap_atomic_page(qdev, chunk_bo, page_base); -+ ptr = qxl_bo_kmap_local_page(qdev, chunk_bo, page_base); - k_data = ptr + page_offset; - memcpy(k_data, i_data, size); -- qxl_bo_kunmap_atomic_page(qdev, chunk_bo, ptr); -+ qxl_bo_kunmap_local_page(qdev, chunk_bo, ptr); - remain -= size; - i_data += size; - out_offset += size; -@@ -189,7 +189,7 @@ qxl_image_init_helper(struct qxl_device *qdev, - qxl_bo_kunmap(chunk_bo); - - image_bo = dimage->bo; -- ptr = qxl_bo_kmap_atomic_page(qdev, image_bo, 0); -+ ptr = qxl_bo_kmap_local_page(qdev, image_bo, 0); - image = ptr; - - image->descriptor.id = 0; -@@ -212,7 +212,7 @@ qxl_image_init_helper(struct qxl_device *qdev, - break; - default: - DRM_ERROR("unsupported image bit depth\n"); -- qxl_bo_kunmap_atomic_page(qdev, image_bo, ptr); -+ qxl_bo_kunmap_local_page(qdev, image_bo, ptr); - return -EINVAL; - } - image->u.bitmap.flags = QXL_BITMAP_TOP_DOWN; -@@ -222,7 +222,7 @@ qxl_image_init_helper(struct qxl_device *qdev, - image->u.bitmap.palette = 0; - image->u.bitmap.data = qxl_bo_physical_address(qdev, chunk_bo, 0); - -- qxl_bo_kunmap_atomic_page(qdev, image_bo, ptr); -+ qxl_bo_kunmap_local_page(qdev, image_bo, ptr); - - return 0; - } -diff --git a/drivers/gpu/drm/qxl/qxl_ioctl.c b/drivers/gpu/drm/qxl/qxl_ioctl.c -index 5cea6eea72ab..785023081b79 100644 ---- a/drivers/gpu/drm/qxl/qxl_ioctl.c -+++ b/drivers/gpu/drm/qxl/qxl_ioctl.c -@@ -89,11 +89,11 @@ apply_reloc(struct qxl_device *qdev, struct qxl_reloc_info *info) - { - void *reloc_page; - -- reloc_page = qxl_bo_kmap_atomic_page(qdev, info->dst_bo, info->dst_offset & PAGE_MASK); -+ reloc_page = qxl_bo_kmap_local_page(qdev, info->dst_bo, 
info->dst_offset & PAGE_MASK); - *(uint64_t *)(reloc_page + (info->dst_offset & ~PAGE_MASK)) = qxl_bo_physical_address(qdev, - info->src_bo, - info->src_offset); -- qxl_bo_kunmap_atomic_page(qdev, info->dst_bo, reloc_page); -+ qxl_bo_kunmap_local_page(qdev, info->dst_bo, reloc_page); - } - - static void -@@ -105,9 +105,9 @@ apply_surf_reloc(struct qxl_device *qdev, struct qxl_reloc_info *info) - if (info->src_bo && !info->src_bo->is_primary) - id = info->src_bo->surface_id; - -- reloc_page = qxl_bo_kmap_atomic_page(qdev, info->dst_bo, info->dst_offset & PAGE_MASK); -+ reloc_page = qxl_bo_kmap_local_page(qdev, info->dst_bo, info->dst_offset & PAGE_MASK); - *(uint32_t *)(reloc_page + (info->dst_offset & ~PAGE_MASK)) = id; -- qxl_bo_kunmap_atomic_page(qdev, info->dst_bo, reloc_page); -+ qxl_bo_kunmap_local_page(qdev, info->dst_bo, reloc_page); - } - - /* return holding the reference to this object */ -@@ -149,7 +149,6 @@ static int qxl_process_single_command(struct qxl_device *qdev, - struct qxl_bo *cmd_bo; - void *fb_cmd; - int i, ret, num_relocs; -- int unwritten; - - switch (cmd->type) { - case QXL_CMD_DRAW: -@@ -185,21 +184,21 @@ static int qxl_process_single_command(struct qxl_device *qdev, - goto out_free_reloc; - - /* TODO copy slow path code from i915 */ -- fb_cmd = qxl_bo_kmap_atomic_page(qdev, cmd_bo, (release->release_offset & PAGE_MASK)); -- unwritten = __copy_from_user_inatomic_nocache -- (fb_cmd + sizeof(union qxl_release_info) + (release->release_offset & ~PAGE_MASK), -- u64_to_user_ptr(cmd->command), cmd->command_size); -+ fb_cmd = qxl_bo_kmap_local_page(qdev, cmd_bo, (release->release_offset & PAGE_MASK)); - -- { -+ if (copy_from_user(fb_cmd + sizeof(union qxl_release_info) + -+ (release->release_offset & ~PAGE_MASK), -+ u64_to_user_ptr(cmd->command), cmd->command_size)) { -+ ret = -EFAULT; -+ } else { - struct qxl_drawable *draw = fb_cmd; - - draw->mm_time = qdev->rom->mm_clock; - } - -- qxl_bo_kunmap_atomic_page(qdev, cmd_bo, fb_cmd); -- if 
(unwritten) { -- DRM_ERROR("got unwritten %d\n", unwritten); -- ret = -EFAULT; -+ qxl_bo_kunmap_local_page(qdev, cmd_bo, fb_cmd); -+ if (ret) { -+ DRM_ERROR("copy from user failed %d\n", ret); - goto out_free_release; - } - -diff --git a/drivers/gpu/drm/qxl/qxl_object.c b/drivers/gpu/drm/qxl/qxl_object.c -index 2bc364412e8b..9350d238ba54 100644 ---- a/drivers/gpu/drm/qxl/qxl_object.c -+++ b/drivers/gpu/drm/qxl/qxl_object.c -@@ -172,8 +172,8 @@ int qxl_bo_kmap(struct qxl_bo *bo, void **ptr) - return 0; - } - --void *qxl_bo_kmap_atomic_page(struct qxl_device *qdev, -- struct qxl_bo *bo, int page_offset) -+void *qxl_bo_kmap_local_page(struct qxl_device *qdev, -+ struct qxl_bo *bo, int page_offset) - { - unsigned long offset; - void *rptr; -@@ -188,7 +188,7 @@ void *qxl_bo_kmap_atomic_page(struct qxl_device *qdev, - goto fallback; - - offset = bo->tbo.mem.start << PAGE_SHIFT; -- return io_mapping_map_atomic_wc(map, offset + page_offset); -+ return io_mapping_map_local_wc(map, offset + page_offset); - fallback: - if (bo->kptr) { - rptr = bo->kptr + (page_offset * PAGE_SIZE); -@@ -214,14 +214,14 @@ void qxl_bo_kunmap(struct qxl_bo *bo) - ttm_bo_kunmap(&bo->kmap); - } - --void qxl_bo_kunmap_atomic_page(struct qxl_device *qdev, -- struct qxl_bo *bo, void *pmap) -+void qxl_bo_kunmap_local_page(struct qxl_device *qdev, -+ struct qxl_bo *bo, void *pmap) - { - if ((bo->tbo.mem.mem_type != TTM_PL_VRAM) && - (bo->tbo.mem.mem_type != TTM_PL_PRIV)) - goto fallback; - -- io_mapping_unmap_atomic(pmap); -+ io_mapping_unmap_local(pmap); - return; - fallback: - qxl_bo_kunmap(bo); -diff --git a/drivers/gpu/drm/qxl/qxl_object.h b/drivers/gpu/drm/qxl/qxl_object.h -index 6b434e5ef795..02f1e0374228 100644 ---- a/drivers/gpu/drm/qxl/qxl_object.h -+++ b/drivers/gpu/drm/qxl/qxl_object.h -@@ -88,8 +88,8 @@ extern int qxl_bo_create(struct qxl_device *qdev, - struct qxl_bo **bo_ptr); - extern int qxl_bo_kmap(struct qxl_bo *bo, void **ptr); - extern void qxl_bo_kunmap(struct qxl_bo *bo); --void 
*qxl_bo_kmap_atomic_page(struct qxl_device *qdev, struct qxl_bo *bo, int page_offset); --void qxl_bo_kunmap_atomic_page(struct qxl_device *qdev, struct qxl_bo *bo, void *map); -+void *qxl_bo_kmap_local_page(struct qxl_device *qdev, struct qxl_bo *bo, int page_offset); -+void qxl_bo_kunmap_local_page(struct qxl_device *qdev, struct qxl_bo *bo, void *map); - extern struct qxl_bo *qxl_bo_ref(struct qxl_bo *bo); - extern void qxl_bo_unref(struct qxl_bo **bo); - extern int qxl_bo_pin(struct qxl_bo *bo); -diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c -index 4fae3e393da1..9f37b51e61c6 100644 ---- a/drivers/gpu/drm/qxl/qxl_release.c -+++ b/drivers/gpu/drm/qxl/qxl_release.c -@@ -408,7 +408,7 @@ union qxl_release_info *qxl_release_map(struct qxl_device *qdev, - union qxl_release_info *info; - struct qxl_bo *bo = release->release_bo; - -- ptr = qxl_bo_kmap_atomic_page(qdev, bo, release->release_offset & PAGE_MASK); -+ ptr = qxl_bo_kmap_local_page(qdev, bo, release->release_offset & PAGE_MASK); - if (!ptr) - return NULL; - info = ptr + (release->release_offset & ~PAGE_MASK); -@@ -423,7 +423,7 @@ void qxl_release_unmap(struct qxl_device *qdev, - void *ptr; - - ptr = ((void *)info) - (release->release_offset & ~PAGE_MASK); -- qxl_bo_kunmap_atomic_page(qdev, bo, ptr); -+ qxl_bo_kunmap_local_page(qdev, bo, ptr); - } - - void qxl_release_fence_buffer_objects(struct qxl_release *release) --- -2.30.2 - diff --git a/debian/patches-rt/0058-drm-nouveau-device-Replace-io_mapping_map_atomic_wc.patch b/debian/patches-rt/0058-drm-nouveau-device-Replace-io_mapping_map_atomic_wc.patch deleted file mode 100644 index 9ce86949d..000000000 --- a/debian/patches-rt/0058-drm-nouveau-device-Replace-io_mapping_map_atomic_wc.patch +++ /dev/null @@ -1,54 +0,0 @@ -From 9aed7aecaf6379d77f9fdf99a7c1214fc07d3922 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:47 +0100 -Subject: [PATCH 058/296] drm/nouveau/device: Replace 
- io_mapping_map_atomic_wc() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Neither fbmem_peek() nor fbmem_poke() require to disable pagefaults and -preemption as a side effect of io_mapping_map_atomic_wc(). - -Use io_mapping_map_local_wc() instead. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Ben Skeggs <bskeggs@redhat.com> -Cc: David Airlie <airlied@linux.ie> -Cc: Daniel Vetter <daniel@ffwll.ch> -Cc: dri-devel@lists.freedesktop.org -Cc: nouveau@lists.freedesktop.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h | 8 ++++---- - 1 file changed, 4 insertions(+), 4 deletions(-) - -diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h -index 6c5bbff12eb4..411f91ee20fa 100644 ---- a/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h -+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/devinit/fbmem.h -@@ -60,19 +60,19 @@ fbmem_fini(struct io_mapping *fb) - static inline u32 - fbmem_peek(struct io_mapping *fb, u32 off) - { -- u8 __iomem *p = io_mapping_map_atomic_wc(fb, off & PAGE_MASK); -+ u8 __iomem *p = io_mapping_map_local_wc(fb, off & PAGE_MASK); - u32 val = ioread32(p + (off & ~PAGE_MASK)); -- io_mapping_unmap_atomic(p); -+ io_mapping_unmap_local(p); - return val; - } - - static inline void - fbmem_poke(struct io_mapping *fb, u32 off, u32 val) - { -- u8 __iomem *p = io_mapping_map_atomic_wc(fb, off & PAGE_MASK); -+ u8 __iomem *p = io_mapping_map_local_wc(fb, off & PAGE_MASK); - iowrite32(val, p + (off & ~PAGE_MASK)); - wmb(); -- io_mapping_unmap_atomic(p); -+ io_mapping_unmap_local(p); - } - - static inline bool --- -2.30.2 - diff --git a/debian/patches-rt/0059-drm-i915-Replace-io_mapping_map_atomic_wc.patch b/debian/patches-rt/0059-drm-i915-Replace-io_mapping_map_atomic_wc.patch deleted file mode 100644 index 2a2c09d89..000000000 --- 
a/debian/patches-rt/0059-drm-i915-Replace-io_mapping_map_atomic_wc.patch +++ /dev/null @@ -1,173 +0,0 @@ -From c68ee383387888c5fa2aa408e01fb56e6bad55a4 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:48 +0100 -Subject: [PATCH 059/296] drm/i915: Replace io_mapping_map_atomic_wc() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -None of these mapping requires the side effect of disabling pagefaults and -preemption. - -Use io_mapping_map_local_wc() instead, and clean up gtt_user_read() and -gtt_user_write() to use a plain copy_from_user() as the local maps are not -disabling pagefaults. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Jani Nikula <jani.nikula@linux.intel.com> -Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> -Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> -Cc: David Airlie <airlied@linux.ie> -Cc: Daniel Vetter <daniel@ffwll.ch> -Cc: intel-gfx@lists.freedesktop.org -Cc: dri-devel@lists.freedesktop.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 7 ++-- - drivers/gpu/drm/i915/i915_gem.c | 40 ++++++------------- - drivers/gpu/drm/i915/selftests/i915_gem.c | 4 +- - drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 8 ++-- - 4 files changed, 22 insertions(+), 37 deletions(-) - -diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c -index bd3046e5a934..1850f13e9d19 100644 ---- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c -+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c -@@ -1081,7 +1081,7 @@ static void reloc_cache_reset(struct reloc_cache *cache, struct i915_execbuffer - struct i915_ggtt *ggtt = cache_to_ggtt(cache); - - intel_gt_flush_ggtt_writes(ggtt->vm.gt); -- io_mapping_unmap_atomic((void __iomem *)vaddr); -+ io_mapping_unmap_local((void __iomem *)vaddr); - - if (drm_mm_node_allocated(&cache->node)) { - 
ggtt->vm.clear_range(&ggtt->vm, -@@ -1147,7 +1147,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, - - if (cache->vaddr) { - intel_gt_flush_ggtt_writes(ggtt->vm.gt); -- io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr)); -+ io_mapping_unmap_local((void __force __iomem *) unmask_page(cache->vaddr)); - } else { - struct i915_vma *vma; - int err; -@@ -1195,8 +1195,7 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj, - offset += page << PAGE_SHIFT; - } - -- vaddr = (void __force *)io_mapping_map_atomic_wc(&ggtt->iomap, -- offset); -+ vaddr = (void __force *)io_mapping_map_local_wc(&ggtt->iomap, offset); - cache->page = page; - cache->vaddr = (unsigned long)vaddr; - -diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c -index 58276694c848..88944c3b1bc8 100644 ---- a/drivers/gpu/drm/i915/i915_gem.c -+++ b/drivers/gpu/drm/i915/i915_gem.c -@@ -355,22 +355,15 @@ gtt_user_read(struct io_mapping *mapping, - char __user *user_data, int length) - { - void __iomem *vaddr; -- unsigned long unwritten; -+ bool fail = false; - - /* We can use the cpu mem copy function because this is X86. 
*/ -- vaddr = io_mapping_map_atomic_wc(mapping, base); -- unwritten = __copy_to_user_inatomic(user_data, -- (void __force *)vaddr + offset, -- length); -- io_mapping_unmap_atomic(vaddr); -- if (unwritten) { -- vaddr = io_mapping_map_wc(mapping, base, PAGE_SIZE); -- unwritten = copy_to_user(user_data, -- (void __force *)vaddr + offset, -- length); -- io_mapping_unmap(vaddr); -- } -- return unwritten; -+ vaddr = io_mapping_map_local_wc(mapping, base); -+ if (copy_to_user(user_data, (void __force *)vaddr + offset, length)) -+ fail = true; -+ io_mapping_unmap_local(vaddr); -+ -+ return fail; - } - - static int -@@ -539,21 +532,14 @@ ggtt_write(struct io_mapping *mapping, - char __user *user_data, int length) - { - void __iomem *vaddr; -- unsigned long unwritten; -+ bool fail = false; - - /* We can use the cpu mem copy function because this is X86. */ -- vaddr = io_mapping_map_atomic_wc(mapping, base); -- unwritten = __copy_from_user_inatomic_nocache((void __force *)vaddr + offset, -- user_data, length); -- io_mapping_unmap_atomic(vaddr); -- if (unwritten) { -- vaddr = io_mapping_map_wc(mapping, base, PAGE_SIZE); -- unwritten = copy_from_user((void __force *)vaddr + offset, -- user_data, length); -- io_mapping_unmap(vaddr); -- } -- -- return unwritten; -+ vaddr = io_mapping_map_local_wc(mapping, base); -+ if (copy_from_user((void __force *)vaddr + offset, user_data, length)) -+ fail = true; -+ io_mapping_unmap_local(vaddr); -+ return fail; - } - - /** -diff --git a/drivers/gpu/drm/i915/selftests/i915_gem.c b/drivers/gpu/drm/i915/selftests/i915_gem.c -index 412e21604a05..432493183d20 100644 ---- a/drivers/gpu/drm/i915/selftests/i915_gem.c -+++ b/drivers/gpu/drm/i915/selftests/i915_gem.c -@@ -57,12 +57,12 @@ static void trash_stolen(struct drm_i915_private *i915) - - ggtt->vm.insert_page(&ggtt->vm, dma, slot, I915_CACHE_NONE, 0); - -- s = io_mapping_map_atomic_wc(&ggtt->iomap, slot); -+ s = io_mapping_map_local_wc(&ggtt->iomap, slot); - for (x = 0; x < PAGE_SIZE / 
sizeof(u32); x++) { - prng = next_pseudo_random32(prng); - iowrite32(prng, &s[x]); - } -- io_mapping_unmap_atomic(s); -+ io_mapping_unmap_local(s); - } - - ggtt->vm.clear_range(&ggtt->vm, slot, PAGE_SIZE); -diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c -index 713770fb2b92..a68bed4e6b64 100644 ---- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c -+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c -@@ -1200,9 +1200,9 @@ static int igt_ggtt_page(void *arg) - u64 offset = tmp.start + order[n] * PAGE_SIZE; - u32 __iomem *vaddr; - -- vaddr = io_mapping_map_atomic_wc(&ggtt->iomap, offset); -+ vaddr = io_mapping_map_local_wc(&ggtt->iomap, offset); - iowrite32(n, vaddr + n); -- io_mapping_unmap_atomic(vaddr); -+ io_mapping_unmap_local(vaddr); - } - intel_gt_flush_ggtt_writes(ggtt->vm.gt); - -@@ -1212,9 +1212,9 @@ static int igt_ggtt_page(void *arg) - u32 __iomem *vaddr; - u32 val; - -- vaddr = io_mapping_map_atomic_wc(&ggtt->iomap, offset); -+ vaddr = io_mapping_map_local_wc(&ggtt->iomap, offset); - val = ioread32(vaddr + n); -- io_mapping_unmap_atomic(vaddr); -+ io_mapping_unmap_local(vaddr); - - if (val != n) { - pr_err("insert page failed: found %d, expected %d\n", --- -2.30.2 - diff --git a/debian/patches-rt/0060-io-mapping-Remove-io_mapping_map_atomic_wc.patch b/debian/patches-rt/0060-io-mapping-Remove-io_mapping_map_atomic_wc.patch deleted file mode 100644 index 0cd36bc83..000000000 --- a/debian/patches-rt/0060-io-mapping-Remove-io_mapping_map_atomic_wc.patch +++ /dev/null @@ -1,140 +0,0 @@ -From a569d0cea61d373336d17263b9654d564cc3cc0e Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 3 Nov 2020 10:27:49 +0100 -Subject: [PATCH 060/296] io-mapping: Remove io_mapping_map_atomic_wc() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -No more users. Get rid of it and remove the traces in documentation. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - Documentation/driver-api/io-mapping.rst | 22 +++++-------- - include/linux/io-mapping.h | 42 ++----------------------- - 2 files changed, 9 insertions(+), 55 deletions(-) - -diff --git a/Documentation/driver-api/io-mapping.rst b/Documentation/driver-api/io-mapping.rst -index a0cfb15988df..a7830c59481f 100644 ---- a/Documentation/driver-api/io-mapping.rst -+++ b/Documentation/driver-api/io-mapping.rst -@@ -21,19 +21,15 @@ mappable, while 'size' indicates how large a mapping region to - enable. Both are in bytes. - - This _wc variant provides a mapping which may only be used with --io_mapping_map_atomic_wc(), io_mapping_map_local_wc() or --io_mapping_map_wc(). -+io_mapping_map_local_wc() or io_mapping_map_wc(). - - With this mapping object, individual pages can be mapped either temporarily - or long term, depending on the requirements. Of course, temporary maps are --more efficient. They come in two flavours:: -+more efficient. - - void *io_mapping_map_local_wc(struct io_mapping *mapping, - unsigned long offset) - -- void *io_mapping_map_atomic_wc(struct io_mapping *mapping, -- unsigned long offset) -- - 'offset' is the offset within the defined mapping region. Accessing - addresses beyond the region specified in the creation function yields - undefined results. Using an offset which is not page aligned yields an -@@ -50,9 +46,6 @@ io_mapping_map_local_wc() has a side effect on X86 32bit as it disables - migration to make the mapping code work. No caller can rely on this side - effect. - --io_mapping_map_atomic_wc() has the side effect of disabling preemption and --pagefaults. Don't use in new code. Use io_mapping_map_local_wc() instead. 
-- - Nested mappings need to be undone in reverse order because the mapping - code uses a stack for keeping track of them:: - -@@ -65,11 +58,10 @@ code uses a stack for keeping track of them:: - The mappings are released with:: - - void io_mapping_unmap_local(void *vaddr) -- void io_mapping_unmap_atomic(void *vaddr) - --'vaddr' must be the value returned by the last io_mapping_map_local_wc() or --io_mapping_map_atomic_wc() call. This unmaps the specified mapping and --undoes the side effects of the mapping functions. -+'vaddr' must be the value returned by the last io_mapping_map_local_wc() -+call. This unmaps the specified mapping and undoes eventual side effects of -+the mapping function. - - If you need to sleep while holding a mapping, you can use the regular - variant, although this may be significantly slower:: -@@ -77,8 +69,8 @@ variant, although this may be significantly slower:: - void *io_mapping_map_wc(struct io_mapping *mapping, - unsigned long offset) - --This works like io_mapping_map_atomic/local_wc() except it has no side --effects and the pointer is globaly visible. -+This works like io_mapping_map_local_wc() except it has no side effects and -+the pointer is globaly visible. 
- - The mappings are released with:: - -diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h -index c093e81310a9..4bb8223f2f82 100644 ---- a/include/linux/io-mapping.h -+++ b/include/linux/io-mapping.h -@@ -60,28 +60,7 @@ io_mapping_fini(struct io_mapping *mapping) - iomap_free(mapping->base, mapping->size); - } - --/* Atomic map/unmap */ --static inline void __iomem * --io_mapping_map_atomic_wc(struct io_mapping *mapping, -- unsigned long offset) --{ -- resource_size_t phys_addr; -- -- BUG_ON(offset >= mapping->size); -- phys_addr = mapping->base + offset; -- preempt_disable(); -- pagefault_disable(); -- return __iomap_local_pfn_prot(PHYS_PFN(phys_addr), mapping->prot); --} -- --static inline void --io_mapping_unmap_atomic(void __iomem *vaddr) --{ -- kunmap_local_indexed((void __force *)vaddr); -- pagefault_enable(); -- preempt_enable(); --} -- -+/* Temporary mappings which are only valid in the current context */ - static inline void __iomem * - io_mapping_map_local_wc(struct io_mapping *mapping, unsigned long offset) - { -@@ -163,24 +142,7 @@ io_mapping_unmap(void __iomem *vaddr) - { - } - --/* Atomic map/unmap */ --static inline void __iomem * --io_mapping_map_atomic_wc(struct io_mapping *mapping, -- unsigned long offset) --{ -- preempt_disable(); -- pagefault_disable(); -- return io_mapping_map_wc(mapping, offset, PAGE_SIZE); --} -- --static inline void --io_mapping_unmap_atomic(void __iomem *vaddr) --{ -- io_mapping_unmap(vaddr); -- pagefault_enable(); -- preempt_enable(); --} -- -+/* Temporary mappings which are only valid in the current context */ - static inline void __iomem * - io_mapping_map_local_wc(struct io_mapping *mapping, unsigned long offset) - { --- -2.30.2 - diff --git a/debian/patches-rt/0061-mm-highmem-Take-kmap_high_get-properly-into-account.patch b/debian/patches-rt/0061-mm-highmem-Take-kmap_high_get-properly-into-account.patch deleted file mode 100644 index 24fc264c6..000000000 --- 
a/debian/patches-rt/0061-mm-highmem-Take-kmap_high_get-properly-into-account.patch +++ /dev/null @@ -1,72 +0,0 @@ -From ae5763668200f801524fc9d8976093ceb461df92 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 12 Nov 2020 11:59:32 +0100 -Subject: [PATCH 061/296] mm/highmem: Take kmap_high_get() properly into - account -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -kunmap_local() warns when the virtual address to unmap is below -PAGE_OFFSET. This is correct except for the case that the mapping was -obtained via kmap_high_get() because the PKMAP addresses are right below -PAGE_OFFSET. - -Cure it by skipping the WARN_ON() when the unmap was handled by -kunmap_high(). - -Fixes: 298fa1ad5571 ("highmem: Provide generic variant of kmap_atomic*") -Reported-by: vtolkm@googlemail.com -Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Cc: Andrew Morton <akpm@linux-foundation.org> -Link: https://lore.kernel.org/r/87y2j6n8mj.fsf@nanos.tec.linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/highmem.c | 19 +++++++++++++------ - 1 file changed, 13 insertions(+), 6 deletions(-) - -diff --git a/mm/highmem.c b/mm/highmem.c -index 8db577e5290c..72b9a2d95c72 100644 ---- a/mm/highmem.c -+++ b/mm/highmem.c -@@ -422,12 +422,15 @@ static inline void *arch_kmap_local_high_get(struct page *page) - #endif - - /* Unmap a local mapping which was obtained by kmap_high_get() */ --static inline void kmap_high_unmap_local(unsigned long vaddr) -+static inline bool kmap_high_unmap_local(unsigned long vaddr) - { - #ifdef ARCH_NEEDS_KMAP_HIGH_GET -- if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) -+ if (vaddr >= PKMAP_ADDR(0) && vaddr < PKMAP_ADDR(LAST_PKMAP)) { - 
kunmap_high(pte_page(pkmap_page_table[PKMAP_NR(vaddr)])); -+ return true; -+ } - #endif -+ return false; - } - - static inline int kmap_local_calc_idx(int idx) -@@ -493,10 +496,14 @@ void kunmap_local_indexed(void *vaddr) - - if (addr < __fix_to_virt(FIX_KMAP_END) || - addr > __fix_to_virt(FIX_KMAP_BEGIN)) { -- WARN_ON_ONCE(addr < PAGE_OFFSET); -- -- /* Handle mappings which were obtained by kmap_high_get() */ -- kmap_high_unmap_local(addr); -+ /* -+ * Handle mappings which were obtained by kmap_high_get() -+ * first as the virtual address of such mappings is below -+ * PAGE_OFFSET. Warn for all other addresses which are in -+ * the user space part of the virtual address space. -+ */ -+ if (!kmap_high_unmap_local(addr)) -+ WARN_ON_ONCE(addr < PAGE_OFFSET); - return; - } - --- -2.30.2 - diff --git a/debian/patches-rt/0062-highmem-Don-t-disable-preemption-on-RT-in-kmap_atomi.patch b/debian/patches-rt/0062-highmem-Don-t-disable-preemption-on-RT-in-kmap_atomi.patch deleted file mode 100644 index 81c5383dc..000000000 --- a/debian/patches-rt/0062-highmem-Don-t-disable-preemption-on-RT-in-kmap_atomi.patch +++ /dev/null @@ -1,71 +0,0 @@ -From f881a329342e9397a55eca767bf0e40e33a93906 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 30 Oct 2020 13:59:06 +0100 -Subject: [PATCH 062/296] highmem: Don't disable preemption on RT in - kmap_atomic() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Disabling preemption makes it impossible to acquire sleeping locks within -kmap_atomic() section. -For PREEMPT_RT it is sufficient to disable migration. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/highmem-internal.h | 20 ++++++++++++++++---- - 1 file changed, 16 insertions(+), 4 deletions(-) - -diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h -index bd15bf9164c2..f9bc6acd3679 100644 ---- a/include/linux/highmem-internal.h -+++ b/include/linux/highmem-internal.h -@@ -90,7 +90,10 @@ static inline void __kunmap_local(void *vaddr) - - static inline void *kmap_atomic(struct page *page) - { -- preempt_disable(); -+ if (IS_ENABLED(CONFIG_PREEMPT_RT)) -+ migrate_disable(); -+ else -+ preempt_disable(); - pagefault_disable(); - return __kmap_local_page_prot(page, kmap_prot); - } -@@ -99,7 +102,10 @@ static inline void __kunmap_atomic(void *addr) - { - kunmap_local_indexed(addr); - pagefault_enable(); -- preempt_enable(); -+ if (IS_ENABLED(CONFIG_PREEMPT_RT)) -+ migrate_enable(); -+ else -+ preempt_enable(); - } - - unsigned int __nr_free_highpages(void); -@@ -172,7 +178,10 @@ static inline void __kunmap_local(void *addr) - - static inline void *kmap_atomic(struct page *page) - { -- preempt_disable(); -+ if (IS_ENABLED(CONFIG_PREEMPT_RT)) -+ migrate_disable(); -+ else -+ preempt_disable(); - pagefault_disable(); - return page_address(page); - } -@@ -183,7 +192,10 @@ static inline void __kunmap_atomic(void *addr) - kunmap_flush_on_unmap(addr); - #endif - pagefault_enable(); -- preempt_enable(); -+ if (IS_ENABLED(CONFIG_PREEMPT_RT)) -+ migrate_enable(); -+ else -+ preempt_enable(); - } - - static inline unsigned int nr_free_highpages(void) { return 0; } --- -2.30.2 - diff --git a/debian/patches-rt/0063-timers-Move-clearing-of-base-timer_running-under-bas.patch b/debian/patches-rt/0063-timers-Move-clearing-of-base-timer_running-under-bas.patch deleted file mode 100644 index e9c29c7ef..000000000 --- a/debian/patches-rt/0063-timers-Move-clearing-of-base-timer_running-under-bas.patch +++ /dev/null @@ -1,63 +0,0 @@ -From 
54c0d1335177ccb302f0cdf69bebff0f618f5bfe Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Sun, 6 Dec 2020 22:40:07 +0100 -Subject: [PATCH 063/296] timers: Move clearing of base::timer_running under - base::lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -syzbot reported KCSAN data races vs. timer_base::timer_running being set to -NULL without holding base::lock in expire_timers(). - -This looks innocent and most reads are clearly not problematic but for a -non-RT kernel it's completely irrelevant whether the store happens before -or after taking the lock. For an RT kernel moving the store under the lock -requires an extra unlock/lock pair in the case that there is a waiter for -the timer. But that's not the end of the world and definitely not worth the -trouble of adding boatloads of comments and annotations to the code. Famous -last words... - -Reported-by: syzbot+aa7c2385d46c5eba0b89@syzkaller.appspotmail.com -Reported-by: syzbot+abea4558531bae1ba9fe@syzkaller.appspotmail.com -Link: https://lkml.kernel.org/r/87lfea7gw8.fsf@nanos.tec.linutronix.de -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Cc: stable-rt@vger.kernel.org ---- - kernel/time/timer.c | 6 ++++-- - 1 file changed, 4 insertions(+), 2 deletions(-) - -diff --git a/kernel/time/timer.c b/kernel/time/timer.c -index c3ad64fb9d8b..02a5812dfc45 100644 ---- a/kernel/time/timer.c -+++ b/kernel/time/timer.c -@@ -1263,8 +1263,10 @@ static inline void timer_base_unlock_expiry(struct timer_base *base) - static void timer_sync_wait_running(struct timer_base *base) - { - if (atomic_read(&base->timer_waiters)) { -+ raw_spin_unlock_irq(&base->lock); - spin_unlock(&base->expiry_lock); - spin_lock(&base->expiry_lock); -+ raw_spin_lock_irq(&base->lock); - } - } - -@@ -1448,14 +1450,14 @@ static void expire_timers(struct timer_base *base, struct hlist_head *head) - 
if (timer->flags & TIMER_IRQSAFE) { - raw_spin_unlock(&base->lock); - call_timer_fn(timer, fn, baseclk); -- base->running_timer = NULL; - raw_spin_lock(&base->lock); -+ base->running_timer = NULL; - } else { - raw_spin_unlock_irq(&base->lock); - call_timer_fn(timer, fn, baseclk); -+ raw_spin_lock_irq(&base->lock); - base->running_timer = NULL; - timer_sync_wait_running(base); -- raw_spin_lock_irq(&base->lock); - } - } - } --- -2.30.2 - diff --git a/debian/patches-rt/0064-blk-mq-Don-t-complete-on-a-remote-CPU-in-force-threa.patch b/debian/patches-rt/0064-blk-mq-Don-t-complete-on-a-remote-CPU-in-force-threa.patch deleted file mode 100644 index d136bdd45..000000000 --- a/debian/patches-rt/0064-blk-mq-Don-t-complete-on-a-remote-CPU-in-force-threa.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 1ccdfa98cd9670c55c468e377d501eb8ca8f8672 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 4 Dec 2020 20:13:54 +0100 -Subject: [PATCH 064/296] blk-mq: Don't complete on a remote CPU in force - threaded mode -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -With force threaded interrupts enabled, raising softirq from an SMP -function call will always result in waking the ksoftirqd thread. This is -not optimal given that the thread runs at SCHED_OTHER priority. - -Completing the request in hard IRQ-context on PREEMPT_RT (which enforces -the force threaded mode) is bad because the completion handler may -acquire sleeping locks which violate the locking context. - -Disable request completing on a remote CPU in force threaded mode. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Reviewed-by: Christoph Hellwig <hch@lst.de> -Reviewed-by: Daniel Wagner <dwagner@suse.de> -Signed-off-by: Jens Axboe <axboe@kernel.dk> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - block/blk-mq.c | 8 ++++++++ - 1 file changed, 8 insertions(+) - -diff --git a/block/blk-mq.c b/block/blk-mq.c -index 2a1eff60c797..3e3d015feaec 100644 ---- a/block/blk-mq.c -+++ b/block/blk-mq.c -@@ -648,6 +648,14 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq) - if (!IS_ENABLED(CONFIG_SMP) || - !test_bit(QUEUE_FLAG_SAME_COMP, &rq->q->queue_flags)) - return false; -+ /* -+ * With force threaded interrupts enabled, raising softirq from an SMP -+ * function call will always result in waking the ksoftirqd thread. -+ * This is probably worse than completing the request on a different -+ * cache domain. -+ */ -+ if (force_irqthreads) -+ return false; - - /* same CPU or cache domain? Complete locally */ - if (cpu == rq->mq_ctx->cpu || --- -2.30.2 - diff --git a/debian/patches-rt/0065-blk-mq-Always-complete-remote-completions-requests-i.patch b/debian/patches-rt/0065-blk-mq-Always-complete-remote-completions-requests-i.patch deleted file mode 100644 index fd594d350..000000000 --- a/debian/patches-rt/0065-blk-mq-Always-complete-remote-completions-requests-i.patch +++ /dev/null @@ -1,49 +0,0 @@ -From 10290ae3a50261a3acc577d3c4000d490a7fb03b Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Sat, 23 Jan 2021 21:10:26 +0100 -Subject: [PATCH 065/296] blk-mq: Always complete remote completions requests - in softirq -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Controllers with multiple queues have their IRQ-handelers pinned to a -CPU. The core shouldn't need to complete the request on a remote CPU. - -Remove this case and always raise the softirq to complete the request. 
- -Reviewed-by: Christoph Hellwig <hch@lst.de> -Reviewed-by: Daniel Wagner <dwagner@suse.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Jens Axboe <axboe@kernel.dk> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - block/blk-mq.c | 14 +------------- - 1 file changed, 1 insertion(+), 13 deletions(-) - -diff --git a/block/blk-mq.c b/block/blk-mq.c -index 3e3d015feaec..510323b8c15e 100644 ---- a/block/blk-mq.c -+++ b/block/blk-mq.c -@@ -626,19 +626,7 @@ static void __blk_mq_complete_request_remote(void *data) - { - struct request *rq = data; - -- /* -- * For most of single queue controllers, there is only one irq vector -- * for handling I/O completion, and the only irq's affinity is set -- * to all possible CPUs. On most of ARCHs, this affinity means the irq -- * is handled on one specific CPU. -- * -- * So complete I/O requests in softirq context in case of single queue -- * devices to avoid degrading I/O performance due to irqsoff latency. -- */ -- if (rq->q->nr_hw_queues == 1) -- blk_mq_trigger_softirq(rq); -- else -- rq->q->mq_ops->complete(rq); -+ blk_mq_trigger_softirq(rq); - } - - static inline bool blk_mq_complete_need_ipi(struct request *rq) --- -2.30.2 - diff --git a/debian/patches-rt/0066-blk-mq-Use-llist_head-for-blk_cpu_done.patch b/debian/patches-rt/0066-blk-mq-Use-llist_head-for-blk_cpu_done.patch deleted file mode 100644 index 872e564fd..000000000 --- a/debian/patches-rt/0066-blk-mq-Use-llist_head-for-blk_cpu_done.patch +++ /dev/null @@ -1,201 +0,0 @@ -From 778ed7c53a9e5e7a695d3d9278cc29c4c7affbfd Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Sat, 23 Jan 2021 21:10:27 +0100 -Subject: [PATCH 066/296] blk-mq: Use llist_head for blk_cpu_done -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -With llist_head it is possible to avoid the locking (the irq-off region) -when items are added. 
This makes it possible to add items on a remote -CPU without additional locking. -llist_add() returns true if the list was previously empty. This can be -used to invoke the SMP function call / raise sofirq only if the first -item was added (otherwise it is already pending). -This simplifies the code a little and reduces the IRQ-off regions. - -blk_mq_raise_softirq() needs a preempt-disable section to ensure the -request is enqueued on the same CPU as the softirq is raised. -Some callers (USB-storage) invoke this path in preemptible context. - -Reviewed-by: Christoph Hellwig <hch@lst.de> -Reviewed-by: Daniel Wagner <dwagner@suse.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Jens Axboe <axboe@kernel.dk> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - block/blk-mq.c | 101 ++++++++++++++++++----------------------- - include/linux/blkdev.h | 2 +- - 2 files changed, 44 insertions(+), 59 deletions(-) - -diff --git a/block/blk-mq.c b/block/blk-mq.c -index 510323b8c15e..87dd67e7abdc 100644 ---- a/block/blk-mq.c -+++ b/block/blk-mq.c -@@ -41,7 +41,7 @@ - #include "blk-mq-sched.h" - #include "blk-rq-qos.h" - --static DEFINE_PER_CPU(struct list_head, blk_cpu_done); -+static DEFINE_PER_CPU(struct llist_head, blk_cpu_done); - - static void blk_mq_poll_stats_start(struct request_queue *q); - static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb); -@@ -565,68 +565,29 @@ void blk_mq_end_request(struct request *rq, blk_status_t error) - } - EXPORT_SYMBOL(blk_mq_end_request); - --/* -- * Softirq action handler - move entries to local list and loop over them -- * while passing them to the queue registered handler. 
-- */ --static __latent_entropy void blk_done_softirq(struct softirq_action *h) -+static void blk_complete_reqs(struct llist_head *list) - { -- struct list_head *cpu_list, local_list; -- -- local_irq_disable(); -- cpu_list = this_cpu_ptr(&blk_cpu_done); -- list_replace_init(cpu_list, &local_list); -- local_irq_enable(); -- -- while (!list_empty(&local_list)) { -- struct request *rq; -+ struct llist_node *entry = llist_reverse_order(llist_del_all(list)); -+ struct request *rq, *next; - -- rq = list_entry(local_list.next, struct request, ipi_list); -- list_del_init(&rq->ipi_list); -+ llist_for_each_entry_safe(rq, next, entry, ipi_list) - rq->q->mq_ops->complete(rq); -- } - } - --static void blk_mq_trigger_softirq(struct request *rq) -+static __latent_entropy void blk_done_softirq(struct softirq_action *h) - { -- struct list_head *list; -- unsigned long flags; -- -- local_irq_save(flags); -- list = this_cpu_ptr(&blk_cpu_done); -- list_add_tail(&rq->ipi_list, list); -- -- /* -- * If the list only contains our just added request, signal a raise of -- * the softirq. If there are already entries there, someone already -- * raised the irq but it hasn't run yet. 
-- */ -- if (list->next == &rq->ipi_list) -- raise_softirq_irqoff(BLOCK_SOFTIRQ); -- local_irq_restore(flags); -+ blk_complete_reqs(this_cpu_ptr(&blk_cpu_done)); - } - - static int blk_softirq_cpu_dead(unsigned int cpu) - { -- /* -- * If a CPU goes away, splice its entries to the current CPU -- * and trigger a run of the softirq -- */ -- local_irq_disable(); -- list_splice_init(&per_cpu(blk_cpu_done, cpu), -- this_cpu_ptr(&blk_cpu_done)); -- raise_softirq_irqoff(BLOCK_SOFTIRQ); -- local_irq_enable(); -- -+ blk_complete_reqs(&per_cpu(blk_cpu_done, cpu)); - return 0; - } - -- - static void __blk_mq_complete_request_remote(void *data) - { -- struct request *rq = data; -- -- blk_mq_trigger_softirq(rq); -+ __raise_softirq_irqoff(BLOCK_SOFTIRQ); - } - - static inline bool blk_mq_complete_need_ipi(struct request *rq) -@@ -655,6 +616,32 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq) - return cpu_online(rq->mq_ctx->cpu); - } - -+static void blk_mq_complete_send_ipi(struct request *rq) -+{ -+ struct llist_head *list; -+ unsigned int cpu; -+ -+ cpu = rq->mq_ctx->cpu; -+ list = &per_cpu(blk_cpu_done, cpu); -+ if (llist_add(&rq->ipi_list, list)) { -+ rq->csd.func = __blk_mq_complete_request_remote; -+ rq->csd.info = rq; -+ rq->csd.flags = 0; -+ smp_call_function_single_async(cpu, &rq->csd); -+ } -+} -+ -+static void blk_mq_raise_softirq(struct request *rq) -+{ -+ struct llist_head *list; -+ -+ preempt_disable(); -+ list = this_cpu_ptr(&blk_cpu_done); -+ if (llist_add(&rq->ipi_list, list)) -+ raise_softirq(BLOCK_SOFTIRQ); -+ preempt_enable(); -+} -+ - bool blk_mq_complete_request_remote(struct request *rq) - { - WRITE_ONCE(rq->state, MQ_RQ_COMPLETE); -@@ -667,17 +654,15 @@ bool blk_mq_complete_request_remote(struct request *rq) - return false; - - if (blk_mq_complete_need_ipi(rq)) { -- rq->csd.func = __blk_mq_complete_request_remote; -- rq->csd.info = rq; -- rq->csd.flags = 0; -- smp_call_function_single_async(rq->mq_ctx->cpu, &rq->csd); -- } else { -- if 
(rq->q->nr_hw_queues > 1) -- return false; -- blk_mq_trigger_softirq(rq); -+ blk_mq_complete_send_ipi(rq); -+ return true; - } - -- return true; -+ if (rq->q->nr_hw_queues == 1) { -+ blk_mq_raise_softirq(rq); -+ return true; -+ } -+ return false; - } - EXPORT_SYMBOL_GPL(blk_mq_complete_request_remote); - -@@ -3905,7 +3890,7 @@ static int __init blk_mq_init(void) - int i; - - for_each_possible_cpu(i) -- INIT_LIST_HEAD(&per_cpu(blk_cpu_done, i)); -+ init_llist_head(&per_cpu(blk_cpu_done, i)); - open_softirq(BLOCK_SOFTIRQ, blk_done_softirq); - - cpuhp_setup_state_nocalls(CPUHP_BLOCK_SOFTIRQ_DEAD, -diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h -index 542471b76f41..c53febd7d169 100644 ---- a/include/linux/blkdev.h -+++ b/include/linux/blkdev.h -@@ -153,7 +153,7 @@ struct request { - */ - union { - struct hlist_node hash; /* merge hash */ -- struct list_head ipi_list; -+ struct llist_node ipi_list; - }; - - /* --- -2.30.2 - diff --git a/debian/patches-rt/0067-lib-test_lockup-Minimum-fix-to-get-it-compiled-on-PR.patch b/debian/patches-rt/0067-lib-test_lockup-Minimum-fix-to-get-it-compiled-on-PR.patch deleted file mode 100644 index 28b4c2e4a..000000000 --- a/debian/patches-rt/0067-lib-test_lockup-Minimum-fix-to-get-it-compiled-on-PR.patch +++ /dev/null @@ -1,65 +0,0 @@ -From 9061b3d9379c848a2964534d8bea8f1e0f421f95 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 28 Oct 2020 18:55:27 +0100 -Subject: [PATCH 067/296] lib/test_lockup: Minimum fix to get it compiled on - PREEMPT_RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -On PREEMPT_RT the locks are quite different so they can't be tested as -it is done below. The alternative is test for the waitlock within -rtmutex. - -This is the bare minim to get it compiled. 
Problems which exists on -PREEMP_RT: -- none of the locks (spinlock_t, rwlock_t, mutex_t, rw_semaphore) may be - acquired with disabled preemption or interrupts. - If I read the code correct the it is possible to acquire a mutex with - disabled interrupts. - I don't know how to obtain a lock pointer. Technically they are not - exported to userland. - -- memory can not be allocated with disabled premption or interrupts even - with GFP_ATOMIC. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - lib/test_lockup.c | 16 ++++++++++++++++ - 1 file changed, 16 insertions(+) - -diff --git a/lib/test_lockup.c b/lib/test_lockup.c -index f1a020bcc763..864554e76973 100644 ---- a/lib/test_lockup.c -+++ b/lib/test_lockup.c -@@ -480,6 +480,21 @@ static int __init test_lockup_init(void) - return -EINVAL; - - #ifdef CONFIG_DEBUG_SPINLOCK -+#ifdef CONFIG_PREEMPT_RT -+ if (test_magic(lock_spinlock_ptr, -+ offsetof(spinlock_t, lock.wait_lock.magic), -+ SPINLOCK_MAGIC) || -+ test_magic(lock_rwlock_ptr, -+ offsetof(rwlock_t, rtmutex.wait_lock.magic), -+ SPINLOCK_MAGIC) || -+ test_magic(lock_mutex_ptr, -+ offsetof(struct mutex, lock.wait_lock.magic), -+ SPINLOCK_MAGIC) || -+ test_magic(lock_rwsem_ptr, -+ offsetof(struct rw_semaphore, rtmutex.wait_lock.magic), -+ SPINLOCK_MAGIC)) -+ return -EINVAL; -+#else - if (test_magic(lock_spinlock_ptr, - offsetof(spinlock_t, rlock.magic), - SPINLOCK_MAGIC) || -@@ -493,6 +508,7 @@ static int __init test_lockup_init(void) - offsetof(struct rw_semaphore, wait_lock.magic), - SPINLOCK_MAGIC)) - return -EINVAL; -+#endif - #endif - - if ((wait_state != TASK_RUNNING || --- -2.30.2 - diff --git a/debian/patches-rt/0068-timers-Don-t-block-on-expiry_lock-for-TIMER_IRQSAFE.patch b/debian/patches-rt/0068-timers-Don-t-block-on-expiry_lock-for-TIMER_IRQSAFE.patch deleted file mode 100644 index ba9db3dbd..000000000 --- a/debian/patches-rt/0068-timers-Don-t-block-on-expiry_lock-for-TIMER_IRQSAFE.patch +++ /dev/null @@ -1,60 +0,0 @@ -From 
948671068371df28129b8c03ac75047b0f420fde Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 2 Nov 2020 14:14:24 +0100 -Subject: [PATCH 068/296] timers: Don't block on ->expiry_lock for - TIMER_IRQSAFE -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -PREEMPT_RT does not spin and wait until a running timer completes its -callback but instead it blocks on a sleeping lock to prevent a deadlock. - -This blocking can not be done for workqueue's IRQ_SAFE timer which will -be canceled in an IRQ-off region. It has to happen to in IRQ-off region -because changing the PENDING bit and clearing the timer must not be -interrupted to avoid a busy-loop. - -The callback invocation of IRQSAFE timer is not preempted on PREEMPT_RT -so there is no need to synchronize on timer_base::expiry_lock. - -Don't acquire the timer_base::expiry_lock for TIMER_IRQSAFE flagged -timer. -Add a lockdep annotation to ensure that this function is always invoked -in preemptible context on PREEMPT_RT. - -Reported-by: Mike Galbraith <efault@gmx.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Cc: stable-rt@vger.kernel.org ---- - kernel/time/timer.c | 9 ++++++++- - 1 file changed, 8 insertions(+), 1 deletion(-) - -diff --git a/kernel/time/timer.c b/kernel/time/timer.c -index 02a5812dfc45..14d9eb790b31 100644 ---- a/kernel/time/timer.c -+++ b/kernel/time/timer.c -@@ -1285,7 +1285,7 @@ static void del_timer_wait_running(struct timer_list *timer) - u32 tf; - - tf = READ_ONCE(timer->flags); -- if (!(tf & TIMER_MIGRATING)) { -+ if (!(tf & (TIMER_MIGRATING | TIMER_IRQSAFE))) { - struct timer_base *base = get_timer_base(tf); - - /* -@@ -1369,6 +1369,13 @@ int del_timer_sync(struct timer_list *timer) - */ - WARN_ON(in_irq() && !(timer->flags & TIMER_IRQSAFE)); - -+ /* -+ * Must be able to sleep on PREEMPT_RT because of the slowpath in -+ * del_timer_wait_running(). 
-+ */ -+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && !(timer->flags & TIMER_IRQSAFE)) -+ lockdep_assert_preemption_enabled(); -+ - do { - ret = try_to_del_timer_sync(timer); - --- -2.30.2 - diff --git a/debian/patches-rt/0071-notifier-Make-atomic_notifiers-use-raw_spinlock.patch b/debian/patches-rt/0071-notifier-Make-atomic_notifiers-use-raw_spinlock.patch deleted file mode 100644 index d4775ff71..000000000 --- a/debian/patches-rt/0071-notifier-Make-atomic_notifiers-use-raw_spinlock.patch +++ /dev/null @@ -1,132 +0,0 @@ -From 67f210025660fc2f9057fcc1f7f893d6583f99b7 Mon Sep 17 00:00:00 2001 -From: Valentin Schneider <valentin.schneider@arm.com> -Date: Sun, 22 Nov 2020 20:19:04 +0000 -Subject: [PATCH 071/296] notifier: Make atomic_notifiers use raw_spinlock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Booting a recent PREEMPT_RT kernel (v5.10-rc3-rt7-rebase) on my arm64 Juno -leads to the idle task blocking on an RT sleeping spinlock down some -notifier path: - - [ 1.809101] BUG: scheduling while atomic: swapper/5/0/0x00000002 - [ 1.809116] Modules linked in: - [ 1.809123] Preemption disabled at: - [ 1.809125] secondary_start_kernel (arch/arm64/kernel/smp.c:227) - [ 1.809146] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 5.10.0-rc3-rt7 #168 - [ 1.809153] Hardware name: ARM Juno development board (r0) (DT) - [ 1.809158] Call trace: - [ 1.809160] dump_backtrace (arch/arm64/kernel/stacktrace.c:100 (discriminator 1)) - [ 1.809170] show_stack (arch/arm64/kernel/stacktrace.c:198) - [ 1.809178] dump_stack (lib/dump_stack.c:122) - [ 1.809188] __schedule_bug (kernel/sched/core.c:4886) - [ 1.809197] __schedule (./arch/arm64/include/asm/preempt.h:18 kernel/sched/core.c:4913 kernel/sched/core.c:5040) - [ 1.809204] preempt_schedule_lock (kernel/sched/core.c:5365 (discriminator 1)) - [ 1.809210] rt_spin_lock_slowlock_locked (kernel/locking/rtmutex.c:1072) - [ 1.809217] rt_spin_lock_slowlock (kernel/locking/rtmutex.c:1110) - [ 
1.809224] rt_spin_lock (./include/linux/rcupdate.h:647 kernel/locking/rtmutex.c:1139) - [ 1.809231] atomic_notifier_call_chain_robust (kernel/notifier.c:71 kernel/notifier.c:118 kernel/notifier.c:186) - [ 1.809240] cpu_pm_enter (kernel/cpu_pm.c:39 kernel/cpu_pm.c:93) - [ 1.809249] psci_enter_idle_state (drivers/cpuidle/cpuidle-psci.c:52 drivers/cpuidle/cpuidle-psci.c:129) - [ 1.809258] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:238) - [ 1.809267] cpuidle_enter (drivers/cpuidle/cpuidle.c:353) - [ 1.809275] do_idle (kernel/sched/idle.c:132 kernel/sched/idle.c:213 kernel/sched/idle.c:273) - [ 1.809282] cpu_startup_entry (kernel/sched/idle.c:368 (discriminator 1)) - [ 1.809288] secondary_start_kernel (arch/arm64/kernel/smp.c:273) - -Two points worth noting: - -1) That this is conceptually the same issue as pointed out in: - 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") -2) Only the _robust() variant of atomic_notifier callchains suffer from - this - -AFAICT only the cpu_pm_notifier_chain really needs to be changed, but -singling it out would mean introducing a new (truly) non-blocking API. At -the same time, callers that are fine with any blocking within the call -chain should use blocking notifiers, so patching up all atomic_notifier's -doesn't seem *too* crazy to me. 
- -Fixes: 70d932985757 ("notifier: Fix broken error handling pattern") -Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> -Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com> -Link: https://lkml.kernel.org/r/20201122201904.30940-1-valentin.schneider@arm.com -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/notifier.h | 6 +++--- - kernel/notifier.c | 12 ++++++------ - 2 files changed, 9 insertions(+), 9 deletions(-) - -diff --git a/include/linux/notifier.h b/include/linux/notifier.h -index 2fb373a5c1ed..723bc2df6388 100644 ---- a/include/linux/notifier.h -+++ b/include/linux/notifier.h -@@ -58,7 +58,7 @@ struct notifier_block { - }; - - struct atomic_notifier_head { -- spinlock_t lock; -+ raw_spinlock_t lock; - struct notifier_block __rcu *head; - }; - -@@ -78,7 +78,7 @@ struct srcu_notifier_head { - }; - - #define ATOMIC_INIT_NOTIFIER_HEAD(name) do { \ -- spin_lock_init(&(name)->lock); \ -+ raw_spin_lock_init(&(name)->lock); \ - (name)->head = NULL; \ - } while (0) - #define BLOCKING_INIT_NOTIFIER_HEAD(name) do { \ -@@ -95,7 +95,7 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh); - cleanup_srcu_struct(&(name)->srcu); - - #define ATOMIC_NOTIFIER_INIT(name) { \ -- .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ -+ .lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \ - .head = NULL } - #define BLOCKING_NOTIFIER_INIT(name) { \ - .rwsem = __RWSEM_INITIALIZER((name).rwsem), \ -diff --git a/kernel/notifier.c b/kernel/notifier.c -index 1b019cbca594..c20782f07643 100644 ---- a/kernel/notifier.c -+++ b/kernel/notifier.c -@@ -142,9 +142,9 @@ int atomic_notifier_chain_register(struct atomic_notifier_head *nh, - unsigned long flags; - int ret; - -- spin_lock_irqsave(&nh->lock, flags); -+ raw_spin_lock_irqsave(&nh->lock, flags); - ret = notifier_chain_register(&nh->head, n); -- spin_unlock_irqrestore(&nh->lock, flags); -+ raw_spin_unlock_irqrestore(&nh->lock, flags); - return ret; - } - 
EXPORT_SYMBOL_GPL(atomic_notifier_chain_register); -@@ -164,9 +164,9 @@ int atomic_notifier_chain_unregister(struct atomic_notifier_head *nh, - unsigned long flags; - int ret; - -- spin_lock_irqsave(&nh->lock, flags); -+ raw_spin_lock_irqsave(&nh->lock, flags); - ret = notifier_chain_unregister(&nh->head, n); -- spin_unlock_irqrestore(&nh->lock, flags); -+ raw_spin_unlock_irqrestore(&nh->lock, flags); - synchronize_rcu(); - return ret; - } -@@ -182,9 +182,9 @@ int atomic_notifier_call_chain_robust(struct atomic_notifier_head *nh, - * Musn't use RCU; because then the notifier list can - * change between the up and down traversal. - */ -- spin_lock_irqsave(&nh->lock, flags); -+ raw_spin_lock_irqsave(&nh->lock, flags); - ret = notifier_call_chain_robust(&nh->head, val_up, val_down, v); -- spin_unlock_irqrestore(&nh->lock, flags); -+ raw_spin_unlock_irqrestore(&nh->lock, flags); - - return ret; - } --- -2.30.2 - diff --git a/debian/patches-rt/0072-rcu-Make-RCU_BOOST-default-on-CONFIG_PREEMPT_RT.patch b/debian/patches-rt/0072-rcu-Make-RCU_BOOST-default-on-CONFIG_PREEMPT_RT.patch deleted file mode 100644 index 101badd47..000000000 --- a/debian/patches-rt/0072-rcu-Make-RCU_BOOST-default-on-CONFIG_PREEMPT_RT.patch +++ /dev/null @@ -1,41 +0,0 @@ -From 0faa86a1befe0177145c18cd75609b9098a7f15c Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 15 Dec 2020 15:16:45 +0100 -Subject: [PATCH 072/296] rcu: Make RCU_BOOST default on CONFIG_PREEMPT_RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -On PREEMPT_RT kernels, RCU callbacks are deferred to the `rcuc' kthread. -This can stall RCU grace periods due to lengthy preemption not only of RCU -readers but also of 'rcuc' kthreads, either of which prevent grace periods -from completing, which can in turn result in OOM. 
Because PREEMPT_RT -kernels have more kthreads that can block grace periods, it is more -important for such kernels to enable RCU_BOOST. - -This commit therefore makes RCU_BOOST the default on PREEMPT_RT. -RCU_BOOST can still be manually disabled if need be. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Paul E. McKenney <paulmck@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/rcu/Kconfig | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig -index b71e21f73c40..a2534cd1765a 100644 ---- a/kernel/rcu/Kconfig -+++ b/kernel/rcu/Kconfig -@@ -188,8 +188,8 @@ config RCU_FAST_NO_HZ - - config RCU_BOOST - bool "Enable RCU priority boosting" -- depends on RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT -- default n -+ depends on (RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT) || PREEMPT_RT -+ default y if PREEMPT_RT - help - This option boosts the priority of preempted RCU readers that - block the current preemptible RCU grace period for too long. 
--- -2.30.2 - diff --git a/debian/patches-rt/0073-rcu-Unconditionally-use-rcuc-threads-on-PREEMPT_RT.patch b/debian/patches-rt/0073-rcu-Unconditionally-use-rcuc-threads-on-PREEMPT_RT.patch deleted file mode 100644 index 3dedf2bd5..000000000 --- a/debian/patches-rt/0073-rcu-Unconditionally-use-rcuc-threads-on-PREEMPT_RT.patch +++ /dev/null @@ -1,66 +0,0 @@ -From 44106456999845a5375059b6494a38be7d1f8ff0 Mon Sep 17 00:00:00 2001 -From: Scott Wood <swood@redhat.com> -Date: Tue, 15 Dec 2020 15:16:46 +0100 -Subject: [PATCH 073/296] rcu: Unconditionally use rcuc threads on PREEMPT_RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -PREEMPT_RT systems have long used the rcutree.use_softirq kernel -boot parameter to avoid use of RCU_SOFTIRQ handlers, which can disrupt -real-time applications by invoking callbacks during return from interrupts -that arrived while executing time-critical code. This kernel boot -parameter instead runs RCU core processing in an 'rcuc' kthread, thus -allowing the scheduler to do its job of avoiding disrupting time-critical -code. - -This commit therefore disables the rcutree.use_softirq kernel boot -parameter on PREEMPT_RT systems, thus forcing such systems to do RCU -core processing in 'rcuc' kthreads. This approach has long been in -use by users of the -rt patchset, and there have been no complaints. -There is therefore no way for the system administrator to override this -choice, at least without modifying and rebuilding the kernel. - -Signed-off-by: Scott Wood <swood@redhat.com> -[bigeasy: Reword commit message] -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -[ paulmck: Update kernel-parameters.txt accordingly. ] -Signed-off-by: Paul E. 
McKenney <paulmck@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - Documentation/admin-guide/kernel-parameters.txt | 4 ++++ - kernel/rcu/tree.c | 4 +++- - 2 files changed, 7 insertions(+), 1 deletion(-) - -diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt -index 26bfe7ae711b..32cd79a3c3e7 100644 ---- a/Documentation/admin-guide/kernel-parameters.txt -+++ b/Documentation/admin-guide/kernel-parameters.txt -@@ -4085,6 +4085,10 @@ - value, meaning that RCU_SOFTIRQ is used by default. - Specify rcutree.use_softirq=0 to use rcuc kthreads. - -+ But note that CONFIG_PREEMPT_RT=y kernels disable -+ this kernel boot parameter, forcibly setting it -+ to zero. -+ - rcutree.rcu_fanout_exact= [KNL] - Disable autobalancing of the rcu_node combining - tree. This is used by rcutorture, and might -diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c -index 5dc36c6e80fd..782a3152bafc 100644 ---- a/kernel/rcu/tree.c -+++ b/kernel/rcu/tree.c -@@ -100,8 +100,10 @@ static struct rcu_state rcu_state = { - static bool dump_tree; - module_param(dump_tree, bool, 0444); - /* By default, use RCU_SOFTIRQ instead of rcuc kthreads. */ --static bool use_softirq = true; -+static bool use_softirq = !IS_ENABLED(CONFIG_PREEMPT_RT); -+#ifndef CONFIG_PREEMPT_RT - module_param(use_softirq, bool, 0444); -+#endif - /* Control rcu_node-tree auto-balancing at boot time. 
*/ - static bool rcu_fanout_exact; - module_param(rcu_fanout_exact, bool, 0444); --- -2.30.2 - diff --git a/debian/patches-rt/0074-rcu-Enable-rcu_normal_after_boot-unconditionally-for.patch b/debian/patches-rt/0074-rcu-Enable-rcu_normal_after_boot-unconditionally-for.patch deleted file mode 100644 index 88be9a3ec..000000000 --- a/debian/patches-rt/0074-rcu-Enable-rcu_normal_after_boot-unconditionally-for.patch +++ /dev/null @@ -1,72 +0,0 @@ -From b37d7fcf02f6efc49ecd7b96aed1342d578d4fc5 Mon Sep 17 00:00:00 2001 -From: Julia Cartwright <julia@ni.com> -Date: Tue, 15 Dec 2020 15:16:47 +0100 -Subject: [PATCH 074/296] rcu: Enable rcu_normal_after_boot unconditionally for - RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Expedited RCU grace periods send IPIs to all non-idle CPUs, and thus can -disrupt time-critical code in real-time applications. However, there -is a portion of boot-time processing (presumably before any real-time -applications have started) where expedited RCU grace periods are the only -option. And so it is that experience with the -rt patchset indicates that -PREEMPT_RT systems should always set the rcupdate.rcu_normal_after_boot -kernel boot parameter. - -This commit therefore makes the post-boot application environment safe -for real-time applications by making PREEMPT_RT systems disable the -rcupdate.rcu_normal_after_boot kernel boot parameter and acting as -if this parameter had been set. This means that post-boot calls to -synchronize_rcu_expedited() will be treated as if they were instead -calls to synchronize_rcu(), thus preventing the IPIs, and thus avoiding -disrupting real-time applications. - -Suggested-by: Luiz Capitulino <lcapitulino@redhat.com> -Acked-by: Paul E. McKenney <paulmck@linux.ibm.com> -Signed-off-by: Julia Cartwright <julia@ni.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -[ paulmck: Update kernel-parameters.txt accordingly. 
] -Signed-off-by: Paul E. McKenney <paulmck@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - Documentation/admin-guide/kernel-parameters.txt | 7 +++++++ - kernel/rcu/update.c | 4 +++- - 2 files changed, 10 insertions(+), 1 deletion(-) - -diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt -index 32cd79a3c3e7..9ee9d99cd811 100644 ---- a/Documentation/admin-guide/kernel-parameters.txt -+++ b/Documentation/admin-guide/kernel-parameters.txt -@@ -4467,6 +4467,13 @@ - only normal grace-period primitives. No effect - on CONFIG_TINY_RCU kernels. - -+ But note that CONFIG_PREEMPT_RT=y kernels enables -+ this kernel boot parameter, forcibly setting -+ it to the value one, that is, converting any -+ post-boot attempt at an expedited RCU grace -+ period to instead use normal non-expedited -+ grace-period processing. -+ - rcupdate.rcu_task_ipi_delay= [KNL] - Set time in jiffies during which RCU tasks will - avoid sending IPIs, starting with the beginning -diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c -index 39334d2d2b37..b95ae86c40a7 100644 ---- a/kernel/rcu/update.c -+++ b/kernel/rcu/update.c -@@ -56,8 +56,10 @@ - #ifndef CONFIG_TINY_RCU - module_param(rcu_expedited, int, 0); - module_param(rcu_normal, int, 0); --static int rcu_normal_after_boot; -+static int rcu_normal_after_boot = IS_ENABLED(CONFIG_PREEMPT_RT); -+#ifndef CONFIG_PREEMPT_RT - module_param(rcu_normal_after_boot, int, 0); -+#endif - #endif /* #ifndef CONFIG_TINY_RCU */ - - #ifdef CONFIG_DEBUG_LOCK_ALLOC --- -2.30.2 - diff --git a/debian/patches-rt/0075-doc-Update-RCU-s-requirements-page-about-the-PREEMPT.patch b/debian/patches-rt/0075-doc-Update-RCU-s-requirements-page-about-the-PREEMPT.patch deleted file mode 100644 index 9706a08d5..000000000 --- a/debian/patches-rt/0075-doc-Update-RCU-s-requirements-page-about-the-PREEMPT.patch +++ /dev/null @@ -1,35 +0,0 @@ -From 925d2ea775140190dac187c0cd641acefadf3c8e Mon 
Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 15 Dec 2020 15:16:48 +0100 -Subject: [PATCH 075/296] doc: Update RCU's requirements page about the - PREEMPT_RT wiki. -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The PREEMPT_RT wiki moved from kernel.org to the Linux Foundation wiki. -The kernel.org wiki is read only. - -This commit therefore updates the URL of the active PREEMPT_RT wiki. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Paul E. McKenney <paulmck@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - Documentation/RCU/Design/Requirements/Requirements.rst | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst -index 1ae79a10a8de..0f7e0237ea14 100644 ---- a/Documentation/RCU/Design/Requirements/Requirements.rst -+++ b/Documentation/RCU/Design/Requirements/Requirements.rst -@@ -2289,7 +2289,7 @@ decides to throw at it. - - The Linux kernel is used for real-time workloads, especially in - conjunction with the `-rt --patchset <https://rt.wiki.kernel.org/index.php/Main_Page>`__. The -+patchset <https://wiki.linuxfoundation.org/realtime/>`__. The - real-time-latency response requirements are such that the traditional - approach of disabling preemption across RCU read-side critical sections - is inappropriate. 
Kernels built with ``CONFIG_PREEMPT=y`` therefore use --- -2.30.2 - diff --git a/debian/patches-rt/0076-doc-Use-CONFIG_PREEMPTION.patch b/debian/patches-rt/0076-doc-Use-CONFIG_PREEMPTION.patch deleted file mode 100644 index 56fda1bce..000000000 --- a/debian/patches-rt/0076-doc-Use-CONFIG_PREEMPTION.patch +++ /dev/null @@ -1,250 +0,0 @@ -From e41ad861f5ae52b5544f33ed5022e4634b300829 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 15 Dec 2020 15:16:49 +0100 -Subject: [PATCH 076/296] doc: Use CONFIG_PREEMPTION -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. -Both PREEMPT and PREEMPT_RT require the same functionality which today -depends on CONFIG_PREEMPT. - -Update the documents and mention CONFIG_PREEMPTION. Spell out -CONFIG_PREEMPT_RT (instead PREEMPT_RT) since it is an option now. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Paul E. McKenney <paulmck@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - .../Expedited-Grace-Periods.rst | 4 ++-- - .../RCU/Design/Requirements/Requirements.rst | 24 +++++++++---------- - Documentation/RCU/checklist.rst | 2 +- - Documentation/RCU/rcubarrier.rst | 6 ++--- - Documentation/RCU/stallwarn.rst | 4 ++-- - Documentation/RCU/whatisRCU.rst | 10 ++++---- - 6 files changed, 25 insertions(+), 25 deletions(-) - -diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst -index 72f0f6fbd53c..6f89cf1e567d 100644 ---- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst -+++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst -@@ -38,7 +38,7 @@ sections. 
- RCU-preempt Expedited Grace Periods - =================================== - --``CONFIG_PREEMPT=y`` kernels implement RCU-preempt. -+``CONFIG_PREEMPTION=y`` kernels implement RCU-preempt. - The overall flow of the handling of a given CPU by an RCU-preempt - expedited grace period is shown in the following diagram: - -@@ -112,7 +112,7 @@ things. - RCU-sched Expedited Grace Periods - --------------------------------- - --``CONFIG_PREEMPT=n`` kernels implement RCU-sched. The overall flow of -+``CONFIG_PREEMPTION=n`` kernels implement RCU-sched. The overall flow of - the handling of a given CPU by an RCU-sched expedited grace period is - shown in the following diagram: - -diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst -index 0f7e0237ea14..17d38480ef5c 100644 ---- a/Documentation/RCU/Design/Requirements/Requirements.rst -+++ b/Documentation/RCU/Design/Requirements/Requirements.rst -@@ -78,7 +78,7 @@ RCU treats a nested set as one big RCU read-side critical section. - Production-quality implementations of ``rcu_read_lock()`` and - ``rcu_read_unlock()`` are extremely lightweight, and in fact have - exactly zero overhead in Linux kernels built for production use with --``CONFIG_PREEMPT=n``. -+``CONFIG_PREEMPTION=n``. - - This guarantee allows ordering to be enforced with extremely low - overhead to readers, for example: -@@ -1182,7 +1182,7 @@ and has become decreasingly so as memory sizes have expanded and memory - costs have plummeted. However, as I learned from Matt Mackall's - `bloatwatch <http://elinux.org/Linux_Tiny-FAQ>`__ efforts, memory - footprint is critically important on single-CPU systems with --non-preemptible (``CONFIG_PREEMPT=n``) kernels, and thus `tiny -+non-preemptible (``CONFIG_PREEMPTION=n``) kernels, and thus `tiny - RCU <https://lkml.kernel.org/g/20090113221724.GA15307@linux.vnet.ibm.com>`__ - was born. 
Josh Triplett has since taken over the small-memory banner - with his `Linux kernel tinification <https://tiny.wiki.kernel.org/>`__ -@@ -1498,7 +1498,7 @@ limitations. - - Implementations of RCU for which ``rcu_read_lock()`` and - ``rcu_read_unlock()`` generate no code, such as Linux-kernel RCU when --``CONFIG_PREEMPT=n``, can be nested arbitrarily deeply. After all, there -+``CONFIG_PREEMPTION=n``, can be nested arbitrarily deeply. After all, there - is no overhead. Except that if all these instances of - ``rcu_read_lock()`` and ``rcu_read_unlock()`` are visible to the - compiler, compilation will eventually fail due to exhausting memory, -@@ -1771,7 +1771,7 @@ implementation can be a no-op. - - However, once the scheduler has spawned its first kthread, this early - boot trick fails for ``synchronize_rcu()`` (as well as for --``synchronize_rcu_expedited()``) in ``CONFIG_PREEMPT=y`` kernels. The -+``synchronize_rcu_expedited()``) in ``CONFIG_PREEMPTION=y`` kernels. The - reason is that an RCU read-side critical section might be preempted, - which means that a subsequent ``synchronize_rcu()`` really does have to - wait for something, as opposed to simply returning immediately. -@@ -2010,7 +2010,7 @@ the following: - 5 rcu_read_unlock(); - 6 do_something_with(v, user_v); - --If the compiler did make this transformation in a ``CONFIG_PREEMPT=n`` kernel -+If the compiler did make this transformation in a ``CONFIG_PREEMPTION=n`` kernel - build, and if ``get_user()`` did page fault, the result would be a quiescent - state in the middle of an RCU read-side critical section. This misplaced - quiescent state could result in line 4 being a use-after-free access, -@@ -2292,7 +2292,7 @@ conjunction with the `-rt - patchset <https://wiki.linuxfoundation.org/realtime/>`__. The - real-time-latency response requirements are such that the traditional - approach of disabling preemption across RCU read-side critical sections --is inappropriate. 
Kernels built with ``CONFIG_PREEMPT=y`` therefore use -+is inappropriate. Kernels built with ``CONFIG_PREEMPTION=y`` therefore use - an RCU implementation that allows RCU read-side critical sections to be - preempted. This requirement made its presence known after users made it - clear that an earlier `real-time -@@ -2414,7 +2414,7 @@ includes ``rcu_read_lock_bh()``, ``rcu_read_unlock_bh()``, - ``call_rcu_bh()``, ``rcu_barrier_bh()``, and - ``rcu_read_lock_bh_held()``. However, the update-side APIs are now - simple wrappers for other RCU flavors, namely RCU-sched in --CONFIG_PREEMPT=n kernels and RCU-preempt otherwise. -+CONFIG_PREEMPTION=n kernels and RCU-preempt otherwise. - - Sched Flavor (Historical) - ~~~~~~~~~~~~~~~~~~~~~~~~~ -@@ -2432,11 +2432,11 @@ not have this property, given that any point in the code outside of an - RCU read-side critical section can be a quiescent state. Therefore, - *RCU-sched* was created, which follows “classic” RCU in that an - RCU-sched grace period waits for pre-existing interrupt and NMI --handlers. In kernels built with ``CONFIG_PREEMPT=n``, the RCU and -+handlers. In kernels built with ``CONFIG_PREEMPTION=n``, the RCU and - RCU-sched APIs have identical implementations, while kernels built with --``CONFIG_PREEMPT=y`` provide a separate implementation for each. -+``CONFIG_PREEMPTION=y`` provide a separate implementation for each. - --Note well that in ``CONFIG_PREEMPT=y`` kernels, -+Note well that in ``CONFIG_PREEMPTION=y`` kernels, - ``rcu_read_lock_sched()`` and ``rcu_read_unlock_sched()`` disable and - re-enable preemption, respectively. This means that if there was a - preemption attempt during the RCU-sched read-side critical section, -@@ -2599,10 +2599,10 @@ userspace execution also delimit tasks-RCU read-side critical sections. - - The tasks-RCU API is quite compact, consisting only of - ``call_rcu_tasks()``, ``synchronize_rcu_tasks()``, and --``rcu_barrier_tasks()``. 
In ``CONFIG_PREEMPT=n`` kernels, trampolines -+``rcu_barrier_tasks()``. In ``CONFIG_PREEMPTION=n`` kernels, trampolines - cannot be preempted, so these APIs map to ``call_rcu()``, - ``synchronize_rcu()``, and ``rcu_barrier()``, respectively. In --``CONFIG_PREEMPT=y`` kernels, trampolines can be preempted, and these -+``CONFIG_PREEMPTION=y`` kernels, trampolines can be preempted, and these - three APIs are therefore implemented by separate functions that check - for voluntary context switches. - -diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst -index 2efed9926c3f..7ed4956043bd 100644 ---- a/Documentation/RCU/checklist.rst -+++ b/Documentation/RCU/checklist.rst -@@ -214,7 +214,7 @@ over a rather long period of time, but improvements are always welcome! - the rest of the system. - - 7. As of v4.20, a given kernel implements only one RCU flavor, -- which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y. -+ which is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y. - If the updater uses call_rcu() or synchronize_rcu(), - then the corresponding readers my use rcu_read_lock() and - rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(), -diff --git a/Documentation/RCU/rcubarrier.rst b/Documentation/RCU/rcubarrier.rst -index f64f4413a47c..3b4a24877496 100644 ---- a/Documentation/RCU/rcubarrier.rst -+++ b/Documentation/RCU/rcubarrier.rst -@@ -9,7 +9,7 @@ RCU (read-copy update) is a synchronization mechanism that can be thought - of as a replacement for read-writer locking (among other things), but with - very low-overhead readers that are immune to deadlock, priority inversion, - and unbounded latency. RCU read-side critical sections are delimited --by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPT -+by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION - kernels, generate no code whatsoever. 
- - This means that RCU writers are unaware of the presence of concurrent -@@ -329,10 +329,10 @@ Answer: This cannot happen. The reason is that on_each_cpu() has its last - to smp_call_function() and further to smp_call_function_on_cpu(), - causing this latter to spin until the cross-CPU invocation of - rcu_barrier_func() has completed. This by itself would prevent -- a grace period from completing on non-CONFIG_PREEMPT kernels, -+ a grace period from completing on non-CONFIG_PREEMPTION kernels, - since each CPU must undergo a context switch (or other quiescent - state) before the grace period can complete. However, this is -- of no use in CONFIG_PREEMPT kernels. -+ of no use in CONFIG_PREEMPTION kernels. - - Therefore, on_each_cpu() disables preemption across its call - to smp_call_function() and also across the local call to -diff --git a/Documentation/RCU/stallwarn.rst b/Documentation/RCU/stallwarn.rst -index c9ab6af4d3be..e97d1b4876ef 100644 ---- a/Documentation/RCU/stallwarn.rst -+++ b/Documentation/RCU/stallwarn.rst -@@ -25,7 +25,7 @@ warnings: - - - A CPU looping with bottom halves disabled. - --- For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel -+- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel - without invoking schedule(). If the looping in the kernel is - really expected and desirable behavior, you might need to add - some calls to cond_resched(). -@@ -44,7 +44,7 @@ warnings: - result in the ``rcu_.*kthread starved for`` console-log message, - which will include additional debugging information. - --- A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might -+- A CPU-bound real-time task in a CONFIG_PREEMPTION kernel, which might - happen to preempt a low-priority task in the middle of an RCU - read-side critical section. 
This is especially damaging if - that low-priority task is not permitted to run on any other CPU, -diff --git a/Documentation/RCU/whatisRCU.rst b/Documentation/RCU/whatisRCU.rst -index fb3ff76c3e73..3b2b1479fd0f 100644 ---- a/Documentation/RCU/whatisRCU.rst -+++ b/Documentation/RCU/whatisRCU.rst -@@ -684,7 +684,7 @@ Quick Quiz #1: - ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - This section presents a "toy" RCU implementation that is based on - "classic RCU". It is also short on performance (but only for updates) and --on features such as hotplug CPU and the ability to run in CONFIG_PREEMPT -+on features such as hotplug CPU and the ability to run in CONFIG_PREEMPTION - kernels. The definitions of rcu_dereference() and rcu_assign_pointer() - are the same as those shown in the preceding section, so they are omitted. - :: -@@ -740,7 +740,7 @@ Quick Quiz #2: - Quick Quiz #3: - If it is illegal to block in an RCU read-side - critical section, what the heck do you do in -- PREEMPT_RT, where normal spinlocks can block??? -+ CONFIG_PREEMPT_RT, where normal spinlocks can block??? - - :ref:`Answers to Quick Quiz <8_whatisRCU>` - -@@ -1094,7 +1094,7 @@ Quick Quiz #2: - overhead is **negative**. - - Answer: -- Imagine a single-CPU system with a non-CONFIG_PREEMPT -+ Imagine a single-CPU system with a non-CONFIG_PREEMPTION - kernel where a routing table is used by process-context - code, but can be updated by irq-context code (for example, - by an "ICMP REDIRECT" packet). The usual way of handling -@@ -1121,10 +1121,10 @@ Answer: - Quick Quiz #3: - If it is illegal to block in an RCU read-side - critical section, what the heck do you do in -- PREEMPT_RT, where normal spinlocks can block??? -+ CONFIG_PREEMPT_RT, where normal spinlocks can block??? - - Answer: -- Just as PREEMPT_RT permits preemption of spinlock -+ Just as CONFIG_PREEMPT_RT permits preemption of spinlock - critical sections, it permits preemption of RCU - read-side critical sections. 
It also permits - spinlocks blocking while in RCU read-side critical --- -2.30.2 - diff --git a/debian/patches-rt/0077-tracing-Merge-irqflags-preempt-counter.patch b/debian/patches-rt/0077-tracing-Merge-irqflags-preempt-counter.patch deleted file mode 100644 index 9ce19ff35..000000000 --- a/debian/patches-rt/0077-tracing-Merge-irqflags-preempt-counter.patch +++ /dev/null @@ -1,1863 +0,0 @@ -From 4015a7e2437d1484c7774c178c77ba6f788446a0 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 3 Feb 2021 11:05:23 -0500 -Subject: [PATCH 077/296] tracing: Merge irqflags + preempt counter. -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The state of the interrupts (irqflags) and the preemption counter are -both passed down to tracing_generic_entry_update(). Only one bit of -irqflags is actually required: The on/off state. The complete 32bit -of the preemption counter isn't needed. Just whether of the upper bits -(softirq, hardirq and NMI) are set and the preemption depth is needed. - -The irqflags and the preemption counter could be evaluated early and the -information stored in an integer `trace_ctx'. -tracing_generic_entry_update() would use the upper bits as the -TRACE_FLAG_* and the lower 8bit as the disabled-preemption depth -(considering that one must be substracted from the counter in one -special cases). - -The actual preemption value is not used except for the tracing record. -The `irqflags' variable is mostly used only for the tracing record. An -exception here is for instance wakeup_tracer_call() or -probe_wakeup_sched_switch() which explicilty disable interrupts and use -that `irqflags' to save (and restore) the IRQ state and to record the -state. 
- -Struct trace_event_buffer has also the `pc' and flags' members which can -be replaced with `trace_ctx' since their actual value is not used -outside of trace recording. - -This will reduce tracing_generic_entry_update() to simply assign values -to struct trace_entry. The evaluation of the TRACE_FLAG_* bits is moved -to _tracing_gen_ctx_flags() which replaces preempt_count() and -local_save_flags() invocations. - -As an example, ftrace_syscall_enter() may invoke: -- trace_buffer_lock_reserve() -> … -> tracing_generic_entry_update() -- event_trigger_unlock_commit() - -> ftrace_trace_stack() -> … -> tracing_generic_entry_update() - -> ftrace_trace_userstack() -> … -> tracing_generic_entry_update() - -In this case the TRACE_FLAG_* bits were evaluated three times. By using -the `trace_ctx' they are evaluated once and assigned three times. - -A build with all tracers enabled on x86-64 with and without the patch: - - text data bss dec hex filename -21970669 17084168 7639260 46694097 2c87ed1 vmlinux.old -21970293 17084168 7639260 46693721 2c87d59 vmlinux.new - -text shrank by 379 bytes, data remained constant. 
- -Link: https://lkml.kernel.org/r/20210125194511.3924915-2-bigeasy@linutronix.de - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/trace_events.h | 25 +++- - kernel/trace/blktrace.c | 17 +-- - kernel/trace/trace.c | 206 +++++++++++++++------------ - kernel/trace/trace.h | 38 +++-- - kernel/trace/trace_branch.c | 6 +- - kernel/trace/trace_event_perf.c | 5 +- - kernel/trace/trace_events.c | 18 +-- - kernel/trace/trace_events_inject.c | 6 +- - kernel/trace/trace_functions.c | 28 ++-- - kernel/trace/trace_functions_graph.c | 32 ++--- - kernel/trace/trace_hwlat.c | 7 +- - kernel/trace/trace_irqsoff.c | 86 +++++------ - kernel/trace/trace_kprobe.c | 10 +- - kernel/trace/trace_mmiotrace.c | 14 +- - kernel/trace/trace_sched_wakeup.c | 71 +++++---- - kernel/trace/trace_syscalls.c | 20 ++- - kernel/trace/trace_uprobe.c | 4 +- - 17 files changed, 286 insertions(+), 307 deletions(-) - ---- a/include/linux/trace_events.h -+++ b/include/linux/trace_events.h -@@ -148,17 +148,29 @@ - - enum print_line_t trace_handle_return(struct trace_seq *s); - --void tracing_generic_entry_update(struct trace_entry *entry, -- unsigned short type, -- unsigned long flags, -- int pc); -+static inline void tracing_generic_entry_update(struct trace_entry *entry, -+ unsigned short type, -+ unsigned int trace_ctx) -+{ -+ struct task_struct *tsk = current; -+ -+ entry->preempt_count = trace_ctx & 0xff; -+ entry->pid = (tsk) ? 
tsk->pid : 0; -+ entry->type = type; -+ entry->flags = trace_ctx >> 16; -+} -+ -+unsigned int tracing_gen_ctx_flags(unsigned long irqflags); -+unsigned int tracing_gen_ctx(void); -+unsigned int tracing_gen_ctx_dec(void); -+ - struct trace_event_file; - - struct ring_buffer_event * - trace_event_buffer_lock_reserve(struct trace_buffer **current_buffer, - struct trace_event_file *trace_file, - int type, unsigned long len, -- unsigned long flags, int pc); -+ unsigned int trace_ctx); - - #define TRACE_RECORD_CMDLINE BIT(0) - #define TRACE_RECORD_TGID BIT(1) -@@ -232,8 +244,7 @@ - struct ring_buffer_event *event; - struct trace_event_file *trace_file; - void *entry; -- unsigned long flags; -- int pc; -+ unsigned int trace_ctx; - struct pt_regs *regs; - }; - ---- a/kernel/trace/blktrace.c -+++ b/kernel/trace/blktrace.c -@@ -72,17 +72,17 @@ - struct blk_io_trace *t; - struct ring_buffer_event *event = NULL; - struct trace_buffer *buffer = NULL; -- int pc = 0; -+ unsigned int trace_ctx = 0; - int cpu = smp_processor_id(); - bool blk_tracer = blk_tracer_enabled; - ssize_t cgid_len = cgid ? sizeof(cgid) : 0; - - if (blk_tracer) { - buffer = blk_tr->array_buffer.buffer; -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx_flags(0); - event = trace_buffer_lock_reserve(buffer, TRACE_BLK, - sizeof(*t) + len + cgid_len, -- 0, pc); -+ trace_ctx); - if (!event) - return; - t = ring_buffer_event_data(event); -@@ -107,7 +107,7 @@ - memcpy((void *) t + sizeof(*t) + cgid_len, data, len); - - if (blk_tracer) -- trace_buffer_unlock_commit(blk_tr, buffer, event, 0, pc); -+ trace_buffer_unlock_commit(blk_tr, buffer, event, trace_ctx); - } - } - -@@ -222,8 +222,9 @@ - struct blk_io_trace *t; - unsigned long flags = 0; - unsigned long *sequence; -+ unsigned int trace_ctx = 0; - pid_t pid; -- int cpu, pc = 0; -+ int cpu; - bool blk_tracer = blk_tracer_enabled; - ssize_t cgid_len = cgid ? 
sizeof(cgid) : 0; - -@@ -252,10 +253,10 @@ - tracing_record_cmdline(current); - - buffer = blk_tr->array_buffer.buffer; -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx_flags(0); - event = trace_buffer_lock_reserve(buffer, TRACE_BLK, - sizeof(*t) + pdu_len + cgid_len, -- 0, pc); -+ trace_ctx); - if (!event) - return; - t = ring_buffer_event_data(event); -@@ -301,7 +302,7 @@ - memcpy((void *)t + sizeof(*t) + cgid_len, pdu_data, pdu_len); - - if (blk_tracer) { -- trace_buffer_unlock_commit(blk_tr, buffer, event, 0, pc); -+ trace_buffer_unlock_commit(blk_tr, buffer, event, trace_ctx); - return; - } - } ---- a/kernel/trace/trace.c -+++ b/kernel/trace/trace.c -@@ -176,7 +176,7 @@ - int tracing_set_tracer(struct trace_array *tr, const char *buf); - static void ftrace_trace_userstack(struct trace_array *tr, - struct trace_buffer *buffer, -- unsigned long flags, int pc); -+ unsigned int trace_ctx); - - #define MAX_TRACER_SIZE 100 - static char bootup_tracer_buf[MAX_TRACER_SIZE] __initdata; -@@ -905,23 +905,23 @@ - - #ifdef CONFIG_STACKTRACE - static void __ftrace_trace_stack(struct trace_buffer *buffer, -- unsigned long flags, -- int skip, int pc, struct pt_regs *regs); -+ unsigned int trace_ctx, -+ int skip, struct pt_regs *regs); - static inline void ftrace_trace_stack(struct trace_array *tr, - struct trace_buffer *buffer, -- unsigned long flags, -- int skip, int pc, struct pt_regs *regs); -+ unsigned int trace_ctx, -+ int skip, struct pt_regs *regs); - - #else - static inline void __ftrace_trace_stack(struct trace_buffer *buffer, -- unsigned long flags, -- int skip, int pc, struct pt_regs *regs) -+ unsigned int trace_ctx, -+ int skip, struct pt_regs *regs) - { - } - static inline void ftrace_trace_stack(struct trace_array *tr, - struct trace_buffer *buffer, -- unsigned long flags, -- int skip, int pc, struct pt_regs *regs) -+ unsigned long trace_ctx, -+ int skip, struct pt_regs *regs) - { - } - -@@ -929,24 +929,24 @@ - - static __always_inline void - 
trace_event_setup(struct ring_buffer_event *event, -- int type, unsigned long flags, int pc) -+ int type, unsigned int trace_ctx) - { - struct trace_entry *ent = ring_buffer_event_data(event); - -- tracing_generic_entry_update(ent, type, flags, pc); -+ tracing_generic_entry_update(ent, type, trace_ctx); - } - - static __always_inline struct ring_buffer_event * - __trace_buffer_lock_reserve(struct trace_buffer *buffer, - int type, - unsigned long len, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { - struct ring_buffer_event *event; - - event = ring_buffer_lock_reserve(buffer, len); - if (event != NULL) -- trace_event_setup(event, type, flags, pc); -+ trace_event_setup(event, type, trace_ctx); - - return event; - } -@@ -1007,25 +1007,22 @@ - struct ring_buffer_event *event; - struct trace_buffer *buffer; - struct print_entry *entry; -- unsigned long irq_flags; -+ unsigned int trace_ctx; - int alloc; -- int pc; - - if (!(global_trace.trace_flags & TRACE_ITER_PRINTK)) - return 0; - -- pc = preempt_count(); -- - if (unlikely(tracing_selftest_running || tracing_disabled)) - return 0; - - alloc = sizeof(*entry) + size + 2; /* possible \n added */ - -- local_save_flags(irq_flags); -+ trace_ctx = tracing_gen_ctx(); - buffer = global_trace.array_buffer.buffer; - ring_buffer_nest_start(buffer); -- event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, alloc, -- irq_flags, pc); -+ event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, alloc, -+ trace_ctx); - if (!event) { - size = 0; - goto out; -@@ -1044,7 +1041,7 @@ - entry->buf[size] = '\0'; - - __buffer_unlock_commit(buffer, event); -- ftrace_trace_stack(&global_trace, buffer, irq_flags, 4, pc, NULL); -+ ftrace_trace_stack(&global_trace, buffer, trace_ctx, 4, NULL); - out: - ring_buffer_nest_end(buffer); - return size; -@@ -1061,25 +1058,22 @@ - struct ring_buffer_event *event; - struct trace_buffer *buffer; - struct bputs_entry *entry; -- unsigned long irq_flags; -+ unsigned int trace_ctx; - int size = 
sizeof(struct bputs_entry); - int ret = 0; -- int pc; - - if (!(global_trace.trace_flags & TRACE_ITER_PRINTK)) - return 0; - -- pc = preempt_count(); -- - if (unlikely(tracing_selftest_running || tracing_disabled)) - return 0; - -- local_save_flags(irq_flags); -+ trace_ctx = tracing_gen_ctx(); - buffer = global_trace.array_buffer.buffer; - - ring_buffer_nest_start(buffer); - event = __trace_buffer_lock_reserve(buffer, TRACE_BPUTS, size, -- irq_flags, pc); -+ trace_ctx); - if (!event) - goto out; - -@@ -1088,7 +1082,7 @@ - entry->str = str; - - __buffer_unlock_commit(buffer, event); -- ftrace_trace_stack(&global_trace, buffer, irq_flags, 4, pc, NULL); -+ ftrace_trace_stack(&global_trace, buffer, trace_ctx, 4, NULL); - - ret = 1; - out: -@@ -2573,36 +2567,69 @@ - } - EXPORT_SYMBOL_GPL(trace_handle_return); - --void --tracing_generic_entry_update(struct trace_entry *entry, unsigned short type, -- unsigned long flags, int pc) -+unsigned int tracing_gen_ctx_flags(unsigned long irqflags) - { -- struct task_struct *tsk = current; -+ unsigned int trace_flags = 0; -+ unsigned int pc; -+ -+ pc = preempt_count(); - -- entry->preempt_count = pc & 0xff; -- entry->pid = (tsk) ? tsk->pid : 0; -- entry->type = type; -- entry->flags = - #ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT -- (irqs_disabled_flags(flags) ? TRACE_FLAG_IRQS_OFF : 0) | -+ if (irqs_disabled_flags(irqflags)) -+ trace_flags |= TRACE_FLAG_IRQS_OFF; - #else -- TRACE_FLAG_IRQS_NOSUPPORT | -+ trace_flags |= TRACE_FLAG_IRQS_NOSUPPORT; - #endif -- ((pc & NMI_MASK ) ? TRACE_FLAG_NMI : 0) | -- ((pc & HARDIRQ_MASK) ? TRACE_FLAG_HARDIRQ : 0) | -- ((pc & SOFTIRQ_OFFSET) ? TRACE_FLAG_SOFTIRQ : 0) | -- (tif_need_resched() ? TRACE_FLAG_NEED_RESCHED : 0) | -- (test_preempt_need_resched() ? 
TRACE_FLAG_PREEMPT_RESCHED : 0); -+ -+ if (pc & NMI_MASK) -+ trace_flags |= TRACE_FLAG_NMI; -+ if (pc & HARDIRQ_MASK) -+ trace_flags |= TRACE_FLAG_HARDIRQ; -+ -+ if (pc & SOFTIRQ_OFFSET) -+ trace_flags |= TRACE_FLAG_SOFTIRQ; -+ -+ if (tif_need_resched()) -+ trace_flags |= TRACE_FLAG_NEED_RESCHED; -+ if (test_preempt_need_resched()) -+ trace_flags |= TRACE_FLAG_PREEMPT_RESCHED; -+ return (trace_flags << 16) | (pc & 0xff); -+} -+ -+unsigned int tracing_gen_ctx(void) -+{ -+ unsigned long irqflags; -+ -+#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT -+ local_save_flags(irqflags); -+#else -+ irqflags = 0; -+#endif -+ return tracing_gen_ctx_flags(irqflags); -+} -+ -+unsigned int tracing_gen_ctx_dec(void) -+{ -+ unsigned int trace_ctx; -+ -+ trace_ctx = tracing_gen_ctx(); -+ -+ /* -+ * Subtract one from the preeption counter if preemption is enabled, -+ * see trace_event_buffer_reserve()for details. -+ */ -+ if (IS_ENABLED(CONFIG_PREEMPTION)) -+ trace_ctx--; -+ return trace_ctx; - } --EXPORT_SYMBOL_GPL(tracing_generic_entry_update); - - struct ring_buffer_event * - trace_buffer_lock_reserve(struct trace_buffer *buffer, - int type, - unsigned long len, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { -- return __trace_buffer_lock_reserve(buffer, type, len, flags, pc); -+ return __trace_buffer_lock_reserve(buffer, type, len, trace_ctx); - } - - DEFINE_PER_CPU(struct ring_buffer_event *, trace_buffered_event); -@@ -2722,7 +2749,7 @@ - trace_event_buffer_lock_reserve(struct trace_buffer **current_rb, - struct trace_event_file *trace_file, - int type, unsigned long len, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { - struct ring_buffer_event *entry; - int val; -@@ -2735,7 +2762,7 @@ - /* Try to use the per cpu buffer first */ - val = this_cpu_inc_return(trace_buffered_event_cnt); - if ((len < (PAGE_SIZE - sizeof(*entry) - sizeof(entry->array[0]))) && val == 1) { -- trace_event_setup(entry, type, flags, pc); -+ trace_event_setup(entry, type, trace_ctx); - 
entry->array[0] = len; - return entry; - } -@@ -2743,7 +2770,7 @@ - } - - entry = __trace_buffer_lock_reserve(*current_rb, -- type, len, flags, pc); -+ type, len, trace_ctx); - /* - * If tracing is off, but we have triggers enabled - * we still need to look at the event data. Use the temp_buffer -@@ -2752,8 +2779,8 @@ - */ - if (!entry && trace_file->flags & EVENT_FILE_FL_TRIGGER_COND) { - *current_rb = temp_buffer; -- entry = __trace_buffer_lock_reserve(*current_rb, -- type, len, flags, pc); -+ entry = __trace_buffer_lock_reserve(*current_rb, type, len, -+ trace_ctx); - } - return entry; - } -@@ -2839,7 +2866,7 @@ - ftrace_exports(fbuffer->event, TRACE_EXPORT_EVENT); - event_trigger_unlock_commit_regs(fbuffer->trace_file, fbuffer->buffer, - fbuffer->event, fbuffer->entry, -- fbuffer->flags, fbuffer->pc, fbuffer->regs); -+ fbuffer->trace_ctx, fbuffer->regs); - } - EXPORT_SYMBOL_GPL(trace_event_buffer_commit); - -@@ -2855,7 +2882,7 @@ - void trace_buffer_unlock_commit_regs(struct trace_array *tr, - struct trace_buffer *buffer, - struct ring_buffer_event *event, -- unsigned long flags, int pc, -+ unsigned int trace_ctx, - struct pt_regs *regs) - { - __buffer_unlock_commit(buffer, event); -@@ -2866,8 +2893,8 @@ - * and mmiotrace, but that's ok if they lose a function or - * two. They are not that meaningful. - */ -- ftrace_trace_stack(tr, buffer, flags, regs ? 0 : STACK_SKIP, pc, regs); -- ftrace_trace_userstack(tr, buffer, flags, pc); -+ ftrace_trace_stack(tr, buffer, trace_ctx, regs ? 
0 : STACK_SKIP, regs); -+ ftrace_trace_userstack(tr, buffer, trace_ctx); - } - - /* -@@ -2881,9 +2908,8 @@ - } - - void --trace_function(struct trace_array *tr, -- unsigned long ip, unsigned long parent_ip, unsigned long flags, -- int pc) -+trace_function(struct trace_array *tr, unsigned long ip, unsigned long -+ parent_ip, unsigned int trace_ctx) - { - struct trace_event_call *call = &event_function; - struct trace_buffer *buffer = tr->array_buffer.buffer; -@@ -2891,7 +2917,7 @@ - struct ftrace_entry *entry; - - event = __trace_buffer_lock_reserve(buffer, TRACE_FN, sizeof(*entry), -- flags, pc); -+ trace_ctx); - if (!event) - return; - entry = ring_buffer_event_data(event); -@@ -2925,8 +2951,8 @@ - static DEFINE_PER_CPU(int, ftrace_stack_reserve); - - static void __ftrace_trace_stack(struct trace_buffer *buffer, -- unsigned long flags, -- int skip, int pc, struct pt_regs *regs) -+ unsigned int trace_ctx, -+ int skip, struct pt_regs *regs) - { - struct trace_event_call *call = &event_kernel_stack; - struct ring_buffer_event *event; -@@ -2974,7 +3000,7 @@ - size = nr_entries * sizeof(unsigned long); - event = __trace_buffer_lock_reserve(buffer, TRACE_STACK, - (sizeof(*entry) - sizeof(entry->caller)) + size, -- flags, pc); -+ trace_ctx); - if (!event) - goto out; - entry = ring_buffer_event_data(event); -@@ -2995,22 +3021,22 @@ - - static inline void ftrace_trace_stack(struct trace_array *tr, - struct trace_buffer *buffer, -- unsigned long flags, -- int skip, int pc, struct pt_regs *regs) -+ unsigned int trace_ctx, -+ int skip, struct pt_regs *regs) - { - if (!(tr->trace_flags & TRACE_ITER_STACKTRACE)) - return; - -- __ftrace_trace_stack(buffer, flags, skip, pc, regs); -+ __ftrace_trace_stack(buffer, trace_ctx, skip, regs); - } - --void __trace_stack(struct trace_array *tr, unsigned long flags, int skip, -- int pc) -+void __trace_stack(struct trace_array *tr, unsigned int trace_ctx, -+ int skip) - { - struct trace_buffer *buffer = tr->array_buffer.buffer; - - if 
(rcu_is_watching()) { -- __ftrace_trace_stack(buffer, flags, skip, pc, NULL); -+ __ftrace_trace_stack(buffer, trace_ctx, skip, NULL); - return; - } - -@@ -3024,7 +3050,7 @@ - return; - - rcu_irq_enter_irqson(); -- __ftrace_trace_stack(buffer, flags, skip, pc, NULL); -+ __ftrace_trace_stack(buffer, trace_ctx, skip, NULL); - rcu_irq_exit_irqson(); - } - -@@ -3034,19 +3060,15 @@ - */ - void trace_dump_stack(int skip) - { -- unsigned long flags; -- - if (tracing_disabled || tracing_selftest_running) - return; - -- local_save_flags(flags); -- - #ifndef CONFIG_UNWINDER_ORC - /* Skip 1 to skip this function. */ - skip++; - #endif - __ftrace_trace_stack(global_trace.array_buffer.buffer, -- flags, skip, preempt_count(), NULL); -+ tracing_gen_ctx(), skip, NULL); - } - EXPORT_SYMBOL_GPL(trace_dump_stack); - -@@ -3055,7 +3077,7 @@ - - static void - ftrace_trace_userstack(struct trace_array *tr, -- struct trace_buffer *buffer, unsigned long flags, int pc) -+ struct trace_buffer *buffer, unsigned int trace_ctx) - { - struct trace_event_call *call = &event_user_stack; - struct ring_buffer_event *event; -@@ -3082,7 +3104,7 @@ - __this_cpu_inc(user_stack_count); - - event = __trace_buffer_lock_reserve(buffer, TRACE_USER_STACK, -- sizeof(*entry), flags, pc); -+ sizeof(*entry), trace_ctx); - if (!event) - goto out_drop_count; - entry = ring_buffer_event_data(event); -@@ -3102,7 +3124,7 @@ - #else /* CONFIG_USER_STACKTRACE_SUPPORT */ - static void ftrace_trace_userstack(struct trace_array *tr, - struct trace_buffer *buffer, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { - } - #endif /* !CONFIG_USER_STACKTRACE_SUPPORT */ -@@ -3232,9 +3254,9 @@ - struct trace_buffer *buffer; - struct trace_array *tr = &global_trace; - struct bprint_entry *entry; -- unsigned long flags; -+ unsigned int trace_ctx; - char *tbuffer; -- int len = 0, size, pc; -+ int len = 0, size; - - if (unlikely(tracing_selftest_running || tracing_disabled)) - return 0; -@@ -3242,7 +3264,7 @@ - /* Don't 
pollute graph traces with trace_vprintk internals */ - pause_graph_tracing(); - -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - preempt_disable_notrace(); - - tbuffer = get_trace_buf(); -@@ -3256,12 +3278,11 @@ - if (len > TRACE_BUF_SIZE/sizeof(int) || len < 0) - goto out_put; - -- local_save_flags(flags); - size = sizeof(*entry) + sizeof(u32) * len; - buffer = tr->array_buffer.buffer; - ring_buffer_nest_start(buffer); - event = __trace_buffer_lock_reserve(buffer, TRACE_BPRINT, size, -- flags, pc); -+ trace_ctx); - if (!event) - goto out; - entry = ring_buffer_event_data(event); -@@ -3271,7 +3292,7 @@ - memcpy(entry->buf, tbuffer, sizeof(u32) * len); - if (!call_filter_check_discard(call, entry, buffer, event)) { - __buffer_unlock_commit(buffer, event); -- ftrace_trace_stack(tr, buffer, flags, 6, pc, NULL); -+ ftrace_trace_stack(tr, buffer, trace_ctx, 6, NULL); - } - - out: -@@ -3294,9 +3315,9 @@ - { - struct trace_event_call *call = &event_print; - struct ring_buffer_event *event; -- int len = 0, size, pc; -+ int len = 0, size; - struct print_entry *entry; -- unsigned long flags; -+ unsigned int trace_ctx; - char *tbuffer; - - if (tracing_disabled || tracing_selftest_running) -@@ -3305,7 +3326,7 @@ - /* Don't pollute graph traces with trace_vprintk internals */ - pause_graph_tracing(); - -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - preempt_disable_notrace(); - - -@@ -3317,11 +3338,10 @@ - - len = vscnprintf(tbuffer, TRACE_BUF_SIZE, fmt, args); - -- local_save_flags(flags); - size = sizeof(*entry) + len + 1; - ring_buffer_nest_start(buffer); - event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size, -- flags, pc); -+ trace_ctx); - if (!event) - goto out; - entry = ring_buffer_event_data(event); -@@ -3330,7 +3350,7 @@ - memcpy(&entry->buf, tbuffer, len + 1); - if (!call_filter_check_discard(call, entry, buffer, event)) { - __buffer_unlock_commit(buffer, event); -- ftrace_trace_stack(&global_trace, buffer, flags, 6, pc, NULL); -+ 
ftrace_trace_stack(&global_trace, buffer, trace_ctx, 6, NULL); - } - - out: -@@ -6643,7 +6663,6 @@ - enum event_trigger_type tt = ETT_NONE; - struct trace_buffer *buffer; - struct print_entry *entry; -- unsigned long irq_flags; - ssize_t written; - int size; - int len; -@@ -6663,7 +6682,6 @@ - - BUILD_BUG_ON(TRACE_BUF_SIZE >= PAGE_SIZE); - -- local_save_flags(irq_flags); - size = sizeof(*entry) + cnt + 2; /* add '\0' and possible '\n' */ - - /* If less than "<faulted>", then make sure we can still add that */ -@@ -6672,7 +6690,7 @@ - - buffer = tr->array_buffer.buffer; - event = __trace_buffer_lock_reserve(buffer, TRACE_PRINT, size, -- irq_flags, preempt_count()); -+ tracing_gen_ctx()); - if (unlikely(!event)) - /* Ring buffer disabled, return as if not open for write */ - return -EBADF; -@@ -6724,7 +6742,6 @@ - struct ring_buffer_event *event; - struct trace_buffer *buffer; - struct raw_data_entry *entry; -- unsigned long irq_flags; - ssize_t written; - int size; - int len; -@@ -6746,14 +6763,13 @@ - - BUILD_BUG_ON(TRACE_BUF_SIZE >= PAGE_SIZE); - -- local_save_flags(irq_flags); - size = sizeof(*entry) + cnt; - if (cnt < FAULT_SIZE_ID) - size += FAULT_SIZE_ID - cnt; - - buffer = tr->array_buffer.buffer; - event = __trace_buffer_lock_reserve(buffer, TRACE_RAW_DATA, size, -- irq_flags, preempt_count()); -+ tracing_gen_ctx()); - if (!event) - /* Ring buffer disabled, return as if not open for write */ - return -EBADF; ---- a/kernel/trace/trace.h -+++ b/kernel/trace/trace.h -@@ -766,8 +766,7 @@ - trace_buffer_lock_reserve(struct trace_buffer *buffer, - int type, - unsigned long len, -- unsigned long flags, -- int pc); -+ unsigned int trace_ctx); - - struct trace_entry *tracing_get_trace_entry(struct trace_array *tr, - struct trace_array_cpu *data); -@@ -792,11 +791,11 @@ - void trace_function(struct trace_array *tr, - unsigned long ip, - unsigned long parent_ip, -- unsigned long flags, int pc); -+ unsigned int trace_ctx); - void trace_graph_function(struct trace_array 
*tr, - unsigned long ip, - unsigned long parent_ip, -- unsigned long flags, int pc); -+ unsigned int trace_ctx); - void trace_latency_header(struct seq_file *m); - void trace_default_header(struct seq_file *m); - void print_trace_header(struct seq_file *m, struct trace_iterator *iter); -@@ -864,11 +863,10 @@ - #endif - - #ifdef CONFIG_STACKTRACE --void __trace_stack(struct trace_array *tr, unsigned long flags, int skip, -- int pc); -+void __trace_stack(struct trace_array *tr, unsigned int trace_ctx, int skip); - #else --static inline void __trace_stack(struct trace_array *tr, unsigned long flags, -- int skip, int pc) -+static inline void __trace_stack(struct trace_array *tr, unsigned int trace_ctx, -+ int skip) - { - } - #endif /* CONFIG_STACKTRACE */ -@@ -1008,10 +1006,10 @@ - extern void graph_trace_close(struct trace_iterator *iter); - extern int __trace_graph_entry(struct trace_array *tr, - struct ftrace_graph_ent *trace, -- unsigned long flags, int pc); -+ unsigned int trace_ctx); - extern void __trace_graph_return(struct trace_array *tr, - struct ftrace_graph_ret *trace, -- unsigned long flags, int pc); -+ unsigned int trace_ctx); - - #ifdef CONFIG_DYNAMIC_FTRACE - extern struct ftrace_hash __rcu *ftrace_graph_hash; -@@ -1474,15 +1472,15 @@ - void trace_buffer_unlock_commit_regs(struct trace_array *tr, - struct trace_buffer *buffer, - struct ring_buffer_event *event, -- unsigned long flags, int pc, -+ unsigned int trace_ctx, - struct pt_regs *regs); - - static inline void trace_buffer_unlock_commit(struct trace_array *tr, - struct trace_buffer *buffer, - struct ring_buffer_event *event, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { -- trace_buffer_unlock_commit_regs(tr, buffer, event, flags, pc, NULL); -+ trace_buffer_unlock_commit_regs(tr, buffer, event, trace_ctx, NULL); - } - - DECLARE_PER_CPU(struct ring_buffer_event *, trace_buffered_event); -@@ -1543,8 +1541,7 @@ - * @buffer: The ring buffer that the event is being written to - *
@event: The event meta data in the ring buffer - * @entry: The event itself -- * @irq_flags: The state of the interrupts at the start of the event -- * @pc: The state of the preempt count at the start of the event. -+ * @trace_ctx: The tracing context flags. - * - * This is a helper function to handle triggers that require data - * from the event itself. It also tests the event against filters and -@@ -1554,12 +1551,12 @@ - event_trigger_unlock_commit(struct trace_event_file *file, - struct trace_buffer *buffer, - struct ring_buffer_event *event, -- void *entry, unsigned long irq_flags, int pc) -+ void *entry, unsigned int trace_ctx) - { - enum event_trigger_type tt = ETT_NONE; - - if (!__event_trigger_test_discard(file, buffer, event, entry, &tt)) -- trace_buffer_unlock_commit(file->tr, buffer, event, irq_flags, pc); -+ trace_buffer_unlock_commit(file->tr, buffer, event, trace_ctx); - - if (tt) - event_triggers_post_call(file, tt); -@@ -1571,8 +1568,7 @@ - * @buffer: The ring buffer that the event is being written to - * @event: The event meta data in the ring buffer - * @entry: The event itself -- * @irq_flags: The state of the interrupts at the start of the event -- * @pc: The state of the preempt count at the start of the event. -+ * @trace_ctx: The tracing context flags. - * - * This is a helper function to handle triggers that require data - * from the event itself. 
It also tests the event against filters and -@@ -1585,14 +1581,14 @@ - event_trigger_unlock_commit_regs(struct trace_event_file *file, - struct trace_buffer *buffer, - struct ring_buffer_event *event, -- void *entry, unsigned long irq_flags, int pc, -+ void *entry, unsigned int trace_ctx, - struct pt_regs *regs) - { - enum event_trigger_type tt = ETT_NONE; - - if (!__event_trigger_test_discard(file, buffer, event, entry, &tt)) - trace_buffer_unlock_commit_regs(file->tr, buffer, event, -- irq_flags, pc, regs); -+ trace_ctx, regs); - - if (tt) - event_triggers_post_call(file, tt); ---- a/kernel/trace/trace_branch.c -+++ b/kernel/trace/trace_branch.c -@@ -37,7 +37,7 @@ - struct ring_buffer_event *event; - struct trace_branch *entry; - unsigned long flags; -- int pc; -+ unsigned int trace_ctx; - const char *p; - - if (current->trace_recursion & TRACE_BRANCH_BIT) -@@ -59,10 +59,10 @@ - if (atomic_read(&data->disabled)) - goto out; - -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx_flags(flags); - buffer = tr->array_buffer.buffer; - event = trace_buffer_lock_reserve(buffer, TRACE_BRANCH, -- sizeof(*entry), flags, pc); -+ sizeof(*entry), trace_ctx); - if (!event) - goto out; - ---- a/kernel/trace/trace_event_perf.c -+++ b/kernel/trace/trace_event_perf.c -@@ -421,11 +421,8 @@ - void perf_trace_buf_update(void *record, u16 type) - { - struct trace_entry *entry = record; -- int pc = preempt_count(); -- unsigned long flags; - -- local_save_flags(flags); -- tracing_generic_entry_update(entry, type, flags, pc); -+ tracing_generic_entry_update(entry, type, tracing_gen_ctx()); - } - NOKPROBE_SYMBOL(perf_trace_buf_update); - ---- a/kernel/trace/trace_events.c -+++ b/kernel/trace/trace_events.c -@@ -258,22 +258,19 @@ - trace_event_ignore_this_pid(trace_file)) - return NULL; - -- local_save_flags(fbuffer->flags); -- fbuffer->pc = preempt_count(); - /* - * If CONFIG_PREEMPTION is enabled, then the tracepoint itself disables - * preemption (adding one to the preempt_count). 
Since we are - * interested in the preempt_count at the time the tracepoint was - * hit, we need to subtract one to offset the increment. - */ -- if (IS_ENABLED(CONFIG_PREEMPTION)) -- fbuffer->pc--; -+ fbuffer->trace_ctx = tracing_gen_ctx_dec(); - fbuffer->trace_file = trace_file; - - fbuffer->event = - trace_event_buffer_lock_reserve(&fbuffer->buffer, trace_file, - event_call->event.type, len, -- fbuffer->flags, fbuffer->pc); -+ fbuffer->trace_ctx); - if (!fbuffer->event) - return NULL; - -@@ -3679,12 +3676,11 @@ - struct trace_buffer *buffer; - struct ring_buffer_event *event; - struct ftrace_entry *entry; -- unsigned long flags; -+ unsigned int trace_ctx; - long disabled; - int cpu; -- int pc; - -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - preempt_disable_notrace(); - cpu = raw_smp_processor_id(); - disabled = atomic_inc_return(&per_cpu(ftrace_test_event_disable, cpu)); -@@ -3692,11 +3688,9 @@ - if (disabled != 1) - goto out; - -- local_save_flags(flags); -- - event = trace_event_buffer_lock_reserve(&buffer, &event_trace_file, - TRACE_FN, sizeof(*entry), -- flags, pc); -+ trace_ctx); - if (!event) - goto out; - entry = ring_buffer_event_data(event); -@@ -3704,7 +3698,7 @@ - entry->parent_ip = parent_ip; - - event_trigger_unlock_commit(&event_trace_file, buffer, event, -- entry, flags, pc); -+ entry, trace_ctx); - out: - atomic_dec(&per_cpu(ftrace_test_event_disable, cpu)); - preempt_enable_notrace(); ---- a/kernel/trace/trace_events_inject.c -+++ b/kernel/trace/trace_events_inject.c -@@ -192,7 +192,6 @@ - static int parse_entry(char *str, struct trace_event_call *call, void **pentry) - { - struct ftrace_event_field *field; -- unsigned long irq_flags; - void *entry = NULL; - int entry_size; - u64 val = 0; -@@ -203,9 +202,8 @@ - if (!entry) - return -ENOMEM; - -- local_save_flags(irq_flags); -- tracing_generic_entry_update(entry, call->event.type, irq_flags, -- preempt_count()); -+ tracing_generic_entry_update(entry, call->event.type, -+ 
tracing_gen_ctx()); - - while ((len = parse_field(str, call, &field, &val)) > 0) { - if (is_function_field(field)) ---- a/kernel/trace/trace_functions.c -+++ b/kernel/trace/trace_functions.c -@@ -133,15 +133,14 @@ - { - struct trace_array *tr = op->private; - struct trace_array_cpu *data; -- unsigned long flags; -+ unsigned int trace_ctx; - int bit; - int cpu; -- int pc; - - if (unlikely(!tr->function_enabled)) - return; - -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - preempt_disable_notrace(); - - bit = trace_test_and_set_recursion(TRACE_FTRACE_START, TRACE_FTRACE_MAX); -@@ -150,10 +149,9 @@ - - cpu = smp_processor_id(); - data = per_cpu_ptr(tr->array_buffer.data, cpu); -- if (!atomic_read(&data->disabled)) { -- local_save_flags(flags); -- trace_function(tr, ip, parent_ip, flags, pc); -- } -+ if (!atomic_read(&data->disabled)) -+ trace_function(tr, ip, parent_ip, trace_ctx); -+ - trace_clear_recursion(bit); - - out: -@@ -187,7 +185,7 @@ - unsigned long flags; - long disabled; - int cpu; -- int pc; -+ unsigned int trace_ctx; - - if (unlikely(!tr->function_enabled)) - return; -@@ -202,9 +200,9 @@ - disabled = atomic_inc_return(&data->disabled); - - if (likely(disabled == 1)) { -- pc = preempt_count(); -- trace_function(tr, ip, parent_ip, flags, pc); -- __trace_stack(tr, flags, STACK_SKIP, pc); -+ trace_ctx = tracing_gen_ctx_flags(flags); -+ trace_function(tr, ip, parent_ip, trace_ctx); -+ __trace_stack(tr, trace_ctx, STACK_SKIP); - } - - atomic_dec(&data->disabled); -@@ -407,13 +405,11 @@ - - static __always_inline void trace_stack(struct trace_array *tr) - { -- unsigned long flags; -- int pc; -+ unsigned int trace_ctx; - -- local_save_flags(flags); -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - -- __trace_stack(tr, flags, FTRACE_STACK_SKIP, pc); -+ __trace_stack(tr, trace_ctx, FTRACE_STACK_SKIP); - } - - static void ---- a/kernel/trace/trace_functions_graph.c -+++ b/kernel/trace/trace_functions_graph.c -@@ -96,8 +96,7 @@ - - int 
__trace_graph_entry(struct trace_array *tr, - struct ftrace_graph_ent *trace, -- unsigned long flags, -- int pc) -+ unsigned int trace_ctx) - { - struct trace_event_call *call = &event_funcgraph_entry; - struct ring_buffer_event *event; -@@ -105,7 +104,7 @@ - struct ftrace_graph_ent_entry *entry; - - event = trace_buffer_lock_reserve(buffer, TRACE_GRAPH_ENT, -- sizeof(*entry), flags, pc); -+ sizeof(*entry), trace_ctx); - if (!event) - return 0; - entry = ring_buffer_event_data(event); -@@ -129,10 +128,10 @@ - struct trace_array *tr = graph_array; - struct trace_array_cpu *data; - unsigned long flags; -+ unsigned int trace_ctx; - long disabled; - int ret; - int cpu; -- int pc; - - if (trace_recursion_test(TRACE_GRAPH_NOTRACE_BIT)) - return 0; -@@ -174,8 +173,8 @@ - data = per_cpu_ptr(tr->array_buffer.data, cpu); - disabled = atomic_inc_return(&data->disabled); - if (likely(disabled == 1)) { -- pc = preempt_count(); -- ret = __trace_graph_entry(tr, trace, flags, pc); -+ trace_ctx = tracing_gen_ctx_flags(flags); -+ ret = __trace_graph_entry(tr, trace, trace_ctx); - } else { - ret = 0; - } -@@ -188,7 +187,7 @@ - - static void - __trace_graph_function(struct trace_array *tr, -- unsigned long ip, unsigned long flags, int pc) -+ unsigned long ip, unsigned int trace_ctx) - { - u64 time = trace_clock_local(); - struct ftrace_graph_ent ent = { -@@ -202,22 +201,21 @@ - .rettime = time, - }; - -- __trace_graph_entry(tr, &ent, flags, pc); -- __trace_graph_return(tr, &ret, flags, pc); -+ __trace_graph_entry(tr, &ent, trace_ctx); -+ __trace_graph_return(tr, &ret, trace_ctx); - } - - void - trace_graph_function(struct trace_array *tr, - unsigned long ip, unsigned long parent_ip, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { -- __trace_graph_function(tr, ip, flags, pc); -+ __trace_graph_function(tr, ip, trace_ctx); - } - - void __trace_graph_return(struct trace_array *tr, - struct ftrace_graph_ret *trace, -- unsigned long flags, -- int pc) -+ unsigned int 
trace_ctx) - { - struct trace_event_call *call = &event_funcgraph_exit; - struct ring_buffer_event *event; -@@ -225,7 +223,7 @@ - struct ftrace_graph_ret_entry *entry; - - event = trace_buffer_lock_reserve(buffer, TRACE_GRAPH_RET, -- sizeof(*entry), flags, pc); -+ sizeof(*entry), trace_ctx); - if (!event) - return; - entry = ring_buffer_event_data(event); -@@ -239,9 +237,9 @@ - struct trace_array *tr = graph_array; - struct trace_array_cpu *data; - unsigned long flags; -+ unsigned int trace_ctx; - long disabled; - int cpu; -- int pc; - - ftrace_graph_addr_finish(trace); - -@@ -255,8 +253,8 @@ - data = per_cpu_ptr(tr->array_buffer.data, cpu); - disabled = atomic_inc_return(&data->disabled); - if (likely(disabled == 1)) { -- pc = preempt_count(); -- __trace_graph_return(tr, trace, flags, pc); -+ trace_ctx = tracing_gen_ctx_flags(flags); -+ __trace_graph_return(tr, trace, trace_ctx); - } - atomic_dec(&data->disabled); - local_irq_restore(flags); ---- a/kernel/trace/trace_hwlat.c -+++ b/kernel/trace/trace_hwlat.c -@@ -108,14 +108,9 @@ - struct trace_buffer *buffer = tr->array_buffer.buffer; - struct ring_buffer_event *event; - struct hwlat_entry *entry; -- unsigned long flags; -- int pc; -- -- pc = preempt_count(); -- local_save_flags(flags); - - event = trace_buffer_lock_reserve(buffer, TRACE_HWLAT, sizeof(*entry), -- flags, pc); -+ tracing_gen_ctx()); - if (!event) - return; - entry = ring_buffer_event_data(event); ---- a/kernel/trace/trace_irqsoff.c -+++ b/kernel/trace/trace_irqsoff.c -@@ -143,11 +143,14 @@ - struct trace_array *tr = irqsoff_trace; - struct trace_array_cpu *data; - unsigned long flags; -+ unsigned int trace_ctx; - - if (!func_prolog_dec(tr, &data, &flags)) - return; - -- trace_function(tr, ip, parent_ip, flags, preempt_count()); -+ trace_ctx = tracing_gen_ctx_flags(flags); -+ -+ trace_function(tr, ip, parent_ip, trace_ctx); - - atomic_dec(&data->disabled); - } -@@ -177,8 +180,8 @@ - struct trace_array *tr = irqsoff_trace; - struct trace_array_cpu 
*data; - unsigned long flags; -+ unsigned int trace_ctx; - int ret; -- int pc; - - if (ftrace_graph_ignore_func(trace)) - return 0; -@@ -195,8 +198,8 @@ - if (!func_prolog_dec(tr, &data, &flags)) - return 0; - -- pc = preempt_count(); -- ret = __trace_graph_entry(tr, trace, flags, pc); -+ trace_ctx = tracing_gen_ctx_flags(flags); -+ ret = __trace_graph_entry(tr, trace, trace_ctx); - atomic_dec(&data->disabled); - - return ret; -@@ -207,15 +210,15 @@ - struct trace_array *tr = irqsoff_trace; - struct trace_array_cpu *data; - unsigned long flags; -- int pc; -+ unsigned int trace_ctx; - - ftrace_graph_addr_finish(trace); - - if (!func_prolog_dec(tr, &data, &flags)) - return; - -- pc = preempt_count(); -- __trace_graph_return(tr, trace, flags, pc); -+ trace_ctx = tracing_gen_ctx_flags(flags); -+ __trace_graph_return(tr, trace, trace_ctx); - atomic_dec(&data->disabled); - } - -@@ -267,12 +270,12 @@ - static void - __trace_function(struct trace_array *tr, - unsigned long ip, unsigned long parent_ip, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { - if (is_graph(tr)) -- trace_graph_function(tr, ip, parent_ip, flags, pc); -+ trace_graph_function(tr, ip, parent_ip, trace_ctx); - else -- trace_function(tr, ip, parent_ip, flags, pc); -+ trace_function(tr, ip, parent_ip, trace_ctx); - } - - #else -@@ -322,15 +325,13 @@ - { - u64 T0, T1, delta; - unsigned long flags; -- int pc; -+ unsigned int trace_ctx; - - T0 = data->preempt_timestamp; - T1 = ftrace_now(cpu); - delta = T1-T0; - -- local_save_flags(flags); -- -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - - if (!report_latency(tr, delta)) - goto out; -@@ -341,9 +342,9 @@ - if (!report_latency(tr, delta)) - goto out_unlock; - -- __trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc); -+ __trace_function(tr, CALLER_ADDR0, parent_ip, trace_ctx); - /* Skip 5 functions to get to the irq/preempt enable function */ -- __trace_stack(tr, flags, 5, pc); -+ __trace_stack(tr, trace_ctx, 5); - - if 
(data->critical_sequence != max_sequence) - goto out_unlock; -@@ -363,16 +364,15 @@ - out: - data->critical_sequence = max_sequence; - data->preempt_timestamp = ftrace_now(cpu); -- __trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc); -+ __trace_function(tr, CALLER_ADDR0, parent_ip, trace_ctx); - } - - static nokprobe_inline void --start_critical_timing(unsigned long ip, unsigned long parent_ip, int pc) -+start_critical_timing(unsigned long ip, unsigned long parent_ip) - { - int cpu; - struct trace_array *tr = irqsoff_trace; - struct trace_array_cpu *data; -- unsigned long flags; - - if (!tracer_enabled || !tracing_is_enabled()) - return; -@@ -393,9 +393,7 @@ - data->preempt_timestamp = ftrace_now(cpu); - data->critical_start = parent_ip ? : ip; - -- local_save_flags(flags); -- -- __trace_function(tr, ip, parent_ip, flags, pc); -+ __trace_function(tr, ip, parent_ip, tracing_gen_ctx()); - - per_cpu(tracing_cpu, cpu) = 1; - -@@ -403,12 +401,12 @@ - } - - static nokprobe_inline void --stop_critical_timing(unsigned long ip, unsigned long parent_ip, int pc) -+stop_critical_timing(unsigned long ip, unsigned long parent_ip) - { - int cpu; - struct trace_array *tr = irqsoff_trace; - struct trace_array_cpu *data; -- unsigned long flags; -+ unsigned int trace_ctx; - - cpu = raw_smp_processor_id(); - /* Always clear the tracing cpu on stopping the trace */ -@@ -428,8 +426,8 @@ - - atomic_inc(&data->disabled); - -- local_save_flags(flags); -- __trace_function(tr, ip, parent_ip, flags, pc); -+ trace_ctx = tracing_gen_ctx(); -+ __trace_function(tr, ip, parent_ip, trace_ctx); - check_critical_timing(tr, data, parent_ip ? 
: ip, cpu); - data->critical_start = 0; - atomic_dec(&data->disabled); -@@ -438,20 +436,16 @@ - /* start and stop critical timings used for stoppage (in idle) */ - void start_critical_timings(void) - { -- int pc = preempt_count(); -- -- if (preempt_trace(pc) || irq_trace()) -- start_critical_timing(CALLER_ADDR0, CALLER_ADDR1, pc); -+ if (preempt_trace(preempt_count()) || irq_trace()) -+ start_critical_timing(CALLER_ADDR0, CALLER_ADDR1); - } - EXPORT_SYMBOL_GPL(start_critical_timings); - NOKPROBE_SYMBOL(start_critical_timings); - - void stop_critical_timings(void) - { -- int pc = preempt_count(); -- -- if (preempt_trace(pc) || irq_trace()) -- stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1, pc); -+ if (preempt_trace(preempt_count()) || irq_trace()) -+ stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1); - } - EXPORT_SYMBOL_GPL(stop_critical_timings); - NOKPROBE_SYMBOL(stop_critical_timings); -@@ -613,19 +607,15 @@ - */ - void tracer_hardirqs_on(unsigned long a0, unsigned long a1) - { -- unsigned int pc = preempt_count(); -- -- if (!preempt_trace(pc) && irq_trace()) -- stop_critical_timing(a0, a1, pc); -+ if (!preempt_trace(preempt_count()) && irq_trace()) -+ stop_critical_timing(a0, a1); - } - NOKPROBE_SYMBOL(tracer_hardirqs_on); - - void tracer_hardirqs_off(unsigned long a0, unsigned long a1) - { -- unsigned int pc = preempt_count(); -- -- if (!preempt_trace(pc) && irq_trace()) -- start_critical_timing(a0, a1, pc); -+ if (!preempt_trace(preempt_count()) && irq_trace()) -+ start_critical_timing(a0, a1); - } - NOKPROBE_SYMBOL(tracer_hardirqs_off); - -@@ -665,18 +655,14 @@ - #ifdef CONFIG_PREEMPT_TRACER - void tracer_preempt_on(unsigned long a0, unsigned long a1) - { -- int pc = preempt_count(); -- -- if (preempt_trace(pc) && !irq_trace()) -- stop_critical_timing(a0, a1, pc); -+ if (preempt_trace(preempt_count()) && !irq_trace()) -+ stop_critical_timing(a0, a1); - } - - void tracer_preempt_off(unsigned long a0, unsigned long a1) - { -- int pc = preempt_count(); -- --
if (preempt_trace(pc) && !irq_trace()) -- start_critical_timing(a0, a1, pc); -+ if (preempt_trace(preempt_count()) && !irq_trace()) -+ start_critical_timing(a0, a1); - } - - static int preemptoff_tracer_init(struct trace_array *tr) ---- a/kernel/trace/trace_kprobe.c -+++ b/kernel/trace/trace_kprobe.c -@@ -1386,8 +1386,7 @@ - if (trace_trigger_soft_disabled(trace_file)) - return; - -- local_save_flags(fbuffer.flags); -- fbuffer.pc = preempt_count(); -+ fbuffer.trace_ctx = tracing_gen_ctx(); - fbuffer.trace_file = trace_file; - - dsize = __get_data_size(&tk->tp, regs); -@@ -1396,7 +1395,7 @@ - trace_event_buffer_lock_reserve(&fbuffer.buffer, trace_file, - call->event.type, - sizeof(*entry) + tk->tp.size + dsize, -- fbuffer.flags, fbuffer.pc); -+ fbuffer.trace_ctx); - if (!fbuffer.event) - return; - -@@ -1434,8 +1433,7 @@ - if (trace_trigger_soft_disabled(trace_file)) - return; - -- local_save_flags(fbuffer.flags); -- fbuffer.pc = preempt_count(); -+ fbuffer.trace_ctx = tracing_gen_ctx(); - fbuffer.trace_file = trace_file; - - dsize = __get_data_size(&tk->tp, regs); -@@ -1443,7 +1441,7 @@ - trace_event_buffer_lock_reserve(&fbuffer.buffer, trace_file, - call->event.type, - sizeof(*entry) + tk->tp.size + dsize, -- fbuffer.flags, fbuffer.pc); -+ fbuffer.trace_ctx); - if (!fbuffer.event) - return; - ---- a/kernel/trace/trace_mmiotrace.c -+++ b/kernel/trace/trace_mmiotrace.c -@@ -300,10 +300,11 @@ - struct trace_buffer *buffer = tr->array_buffer.buffer; - struct ring_buffer_event *event; - struct trace_mmiotrace_rw *entry; -- int pc = preempt_count(); -+ unsigned int trace_ctx; - -+ trace_ctx = tracing_gen_ctx_flags(0); - event = trace_buffer_lock_reserve(buffer, TRACE_MMIO_RW, -- sizeof(*entry), 0, pc); -+ sizeof(*entry), trace_ctx); - if (!event) { - atomic_inc(&dropped_count); - return; -@@ -312,7 +313,7 @@ - entry->rw = *rw; - - if (!call_filter_check_discard(call, entry, buffer, event)) -- trace_buffer_unlock_commit(tr, buffer, event, 0, pc); -+ 
trace_buffer_unlock_commit(tr, buffer, event, trace_ctx); - } - - void mmio_trace_rw(struct mmiotrace_rw *rw) -@@ -330,10 +331,11 @@ - struct trace_buffer *buffer = tr->array_buffer.buffer; - struct ring_buffer_event *event; - struct trace_mmiotrace_map *entry; -- int pc = preempt_count(); -+ unsigned int trace_ctx; - -+ trace_ctx = tracing_gen_ctx_flags(0); - event = trace_buffer_lock_reserve(buffer, TRACE_MMIO_MAP, -- sizeof(*entry), 0, pc); -+ sizeof(*entry), trace_ctx); - if (!event) { - atomic_inc(&dropped_count); - return; -@@ -342,7 +344,7 @@ - entry->map = *map; - - if (!call_filter_check_discard(call, entry, buffer, event)) -- trace_buffer_unlock_commit(tr, buffer, event, 0, pc); -+ trace_buffer_unlock_commit(tr, buffer, event, trace_ctx); - } - - void mmio_trace_mapping(struct mmiotrace_map *map) ---- a/kernel/trace/trace_sched_wakeup.c -+++ b/kernel/trace/trace_sched_wakeup.c -@@ -67,7 +67,7 @@ - static int - func_prolog_preempt_disable(struct trace_array *tr, - struct trace_array_cpu **data, -- int *pc) -+ unsigned int *trace_ctx) - { - long disabled; - int cpu; -@@ -75,7 +75,7 @@ - if (likely(!wakeup_task)) - return 0; - -- *pc = preempt_count(); -+ *trace_ctx = tracing_gen_ctx(); - preempt_disable_notrace(); - - cpu = raw_smp_processor_id(); -@@ -116,8 +116,8 @@ - { - struct trace_array *tr = wakeup_trace; - struct trace_array_cpu *data; -- unsigned long flags; -- int pc, ret = 0; -+ unsigned int trace_ctx; -+ int ret = 0; - - if (ftrace_graph_ignore_func(trace)) - return 0; -@@ -131,11 +131,10 @@ - if (ftrace_graph_notrace_addr(trace->func)) - return 1; - -- if (!func_prolog_preempt_disable(tr, &data, &pc)) -+ if (!func_prolog_preempt_disable(tr, &data, &trace_ctx)) - return 0; - -- local_save_flags(flags); -- ret = __trace_graph_entry(tr, trace, flags, pc); -+ ret = __trace_graph_entry(tr, trace, trace_ctx); - atomic_dec(&data->disabled); - preempt_enable_notrace(); - -@@ -146,16 +145,14 @@ - { - struct trace_array *tr = wakeup_trace; - struct 
trace_array_cpu *data; -- unsigned long flags; -- int pc; -+ unsigned int trace_ctx; - - ftrace_graph_addr_finish(trace); - -- if (!func_prolog_preempt_disable(tr, &data, &pc)) -+ if (!func_prolog_preempt_disable(tr, &data, &trace_ctx)) - return; - -- local_save_flags(flags); -- __trace_graph_return(tr, trace, flags, pc); -+ __trace_graph_return(tr, trace, trace_ctx); - atomic_dec(&data->disabled); - - preempt_enable_notrace(); -@@ -217,13 +214,13 @@ - struct trace_array *tr = wakeup_trace; - struct trace_array_cpu *data; - unsigned long flags; -- int pc; -+ unsigned int trace_ctx; - -- if (!func_prolog_preempt_disable(tr, &data, &pc)) -+ if (!func_prolog_preempt_disable(tr, &data, &trace_ctx)) - return; - - local_irq_save(flags); -- trace_function(tr, ip, parent_ip, flags, pc); -+ trace_function(tr, ip, parent_ip, trace_ctx); - local_irq_restore(flags); - - atomic_dec(&data->disabled); -@@ -303,12 +300,12 @@ - static void - __trace_function(struct trace_array *tr, - unsigned long ip, unsigned long parent_ip, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { - if (is_graph(tr)) -- trace_graph_function(tr, ip, parent_ip, flags, pc); -+ trace_graph_function(tr, ip, parent_ip, trace_ctx); - else -- trace_function(tr, ip, parent_ip, flags, pc); -+ trace_function(tr, ip, parent_ip, trace_ctx); - } - - static int wakeup_flag_changed(struct trace_array *tr, u32 mask, int set) -@@ -375,7 +372,7 @@ - tracing_sched_switch_trace(struct trace_array *tr, - struct task_struct *prev, - struct task_struct *next, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { - struct trace_event_call *call = &event_context_switch; - struct trace_buffer *buffer = tr->array_buffer.buffer; -@@ -383,7 +380,7 @@ - struct ctx_switch_entry *entry; - - event = trace_buffer_lock_reserve(buffer, TRACE_CTX, -- sizeof(*entry), flags, pc); -+ sizeof(*entry), trace_ctx); - if (!event) - return; - entry = ring_buffer_event_data(event); -@@ -396,14 +393,14 @@ - entry->next_cpu = 
task_cpu(next); - - if (!call_filter_check_discard(call, entry, buffer, event)) -- trace_buffer_unlock_commit(tr, buffer, event, flags, pc); -+ trace_buffer_unlock_commit(tr, buffer, event, trace_ctx); - } - - static void - tracing_sched_wakeup_trace(struct trace_array *tr, - struct task_struct *wakee, - struct task_struct *curr, -- unsigned long flags, int pc) -+ unsigned int trace_ctx) - { - struct trace_event_call *call = &event_wakeup; - struct ring_buffer_event *event; -@@ -411,7 +408,7 @@ - struct trace_buffer *buffer = tr->array_buffer.buffer; - - event = trace_buffer_lock_reserve(buffer, TRACE_WAKE, -- sizeof(*entry), flags, pc); -+ sizeof(*entry), trace_ctx); - if (!event) - return; - entry = ring_buffer_event_data(event); -@@ -424,7 +421,7 @@ - entry->next_cpu = task_cpu(wakee); - - if (!call_filter_check_discard(call, entry, buffer, event)) -- trace_buffer_unlock_commit(tr, buffer, event, flags, pc); -+ trace_buffer_unlock_commit(tr, buffer, event, trace_ctx); - } - - static void notrace -@@ -436,7 +433,7 @@ - unsigned long flags; - long disabled; - int cpu; -- int pc; -+ unsigned int trace_ctx; - - tracing_record_cmdline(prev); - -@@ -455,8 +452,6 @@ - if (next != wakeup_task) - return; - -- pc = preempt_count(); -- - /* disable local data, not wakeup_cpu data */ - cpu = raw_smp_processor_id(); - disabled = atomic_inc_return(&per_cpu_ptr(wakeup_trace->array_buffer.data, cpu)->disabled); -@@ -464,6 +459,8 @@ - goto out; - - local_irq_save(flags); -+ trace_ctx = tracing_gen_ctx_flags(flags); -+ - arch_spin_lock(&wakeup_lock); - - /* We could race with grabbing wakeup_lock */ -@@ -473,9 +470,9 @@ - /* The task we are waiting for is waking up */ - data = per_cpu_ptr(wakeup_trace->array_buffer.data, wakeup_cpu); - -- __trace_function(wakeup_trace, CALLER_ADDR0, CALLER_ADDR1, flags, pc); -- tracing_sched_switch_trace(wakeup_trace, prev, next, flags, pc); -- __trace_stack(wakeup_trace, flags, 0, pc); -+ __trace_function(wakeup_trace, CALLER_ADDR0, 
CALLER_ADDR1, trace_ctx); -+ tracing_sched_switch_trace(wakeup_trace, prev, next, trace_ctx); -+ __trace_stack(wakeup_trace, trace_ctx, 0); - - T0 = data->preempt_timestamp; - T1 = ftrace_now(cpu); -@@ -527,9 +524,8 @@ - { - struct trace_array_cpu *data; - int cpu = smp_processor_id(); -- unsigned long flags; - long disabled; -- int pc; -+ unsigned int trace_ctx; - - if (likely(!tracer_enabled)) - return; -@@ -550,11 +546,12 @@ - (!dl_task(p) && (p->prio >= wakeup_prio || p->prio >= current->prio))) - return; - -- pc = preempt_count(); - disabled = atomic_inc_return(&per_cpu_ptr(wakeup_trace->array_buffer.data, cpu)->disabled); - if (unlikely(disabled != 1)) - goto out; - -+ trace_ctx = tracing_gen_ctx(); -+ - /* interrupts should be off from try_to_wake_up */ - arch_spin_lock(&wakeup_lock); - -@@ -581,19 +578,17 @@ - - wakeup_task = get_task_struct(p); - -- local_save_flags(flags); -- - data = per_cpu_ptr(wakeup_trace->array_buffer.data, wakeup_cpu); - data->preempt_timestamp = ftrace_now(cpu); -- tracing_sched_wakeup_trace(wakeup_trace, p, current, flags, pc); -- __trace_stack(wakeup_trace, flags, 0, pc); -+ tracing_sched_wakeup_trace(wakeup_trace, p, current, trace_ctx); -+ __trace_stack(wakeup_trace, trace_ctx, 0); - - /* - * We must be careful in using CALLER_ADDR2. But since wake_up - * is not called by an assembly function (where as schedule is) - * it should be safe to use it here. 
- */ -- __trace_function(wakeup_trace, CALLER_ADDR1, CALLER_ADDR2, flags, pc); -+ __trace_function(wakeup_trace, CALLER_ADDR1, CALLER_ADDR2, trace_ctx); - - out_locked: - arch_spin_unlock(&wakeup_lock); ---- a/kernel/trace/trace_syscalls.c -+++ b/kernel/trace/trace_syscalls.c -@@ -298,9 +298,8 @@ - struct syscall_metadata *sys_data; - struct ring_buffer_event *event; - struct trace_buffer *buffer; -- unsigned long irq_flags; -+ unsigned int trace_ctx; - unsigned long args[6]; -- int pc; - int syscall_nr; - int size; - -@@ -322,12 +321,11 @@ - - size = sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args; - -- local_save_flags(irq_flags); -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - - buffer = tr->array_buffer.buffer; - event = trace_buffer_lock_reserve(buffer, -- sys_data->enter_event->event.type, size, irq_flags, pc); -+ sys_data->enter_event->event.type, size, trace_ctx); - if (!event) - return; - -@@ -337,7 +335,7 @@ - memcpy(entry->args, args, sizeof(unsigned long) * sys_data->nb_args); - - event_trigger_unlock_commit(trace_file, buffer, event, entry, -- irq_flags, pc); -+ trace_ctx); - } - - static void ftrace_syscall_exit(void *data, struct pt_regs *regs, long ret) -@@ -348,8 +346,7 @@ - struct syscall_metadata *sys_data; - struct ring_buffer_event *event; - struct trace_buffer *buffer; -- unsigned long irq_flags; -- int pc; -+ unsigned int trace_ctx; - int syscall_nr; - - syscall_nr = trace_get_syscall_nr(current, regs); -@@ -368,13 +365,12 @@ - if (!sys_data) - return; - -- local_save_flags(irq_flags); -- pc = preempt_count(); -+ trace_ctx = tracing_gen_ctx(); - - buffer = tr->array_buffer.buffer; - event = trace_buffer_lock_reserve(buffer, - sys_data->exit_event->event.type, sizeof(*entry), -- irq_flags, pc); -+ trace_ctx); - if (!event) - return; - -@@ -383,7 +379,7 @@ - entry->ret = syscall_get_return_value(current, regs); - - event_trigger_unlock_commit(trace_file, buffer, event, entry, -- irq_flags, pc); -+ trace_ctx); - } - - 
static int reg_event_syscall_enter(struct trace_event_file *file, ---- a/kernel/trace/trace_uprobe.c -+++ b/kernel/trace/trace_uprobe.c -@@ -961,7 +961,7 @@ - esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu)); - size = esize + tu->tp.size + dsize; - event = trace_event_buffer_lock_reserve(&buffer, trace_file, -- call->event.type, size, 0, 0); -+ call->event.type, size, 0); - if (!event) - return; - -@@ -977,7 +977,7 @@ - - memcpy(data, ucb->buf, tu->tp.size + dsize); - -- event_trigger_unlock_commit(trace_file, buffer, event, entry, 0, 0); -+ event_trigger_unlock_commit(trace_file, buffer, event, entry, 0); - } - - /* uprobe handler */ diff --git a/debian/patches-rt/0078-tracing-Inline-tracing_gen_ctx_flags.patch b/debian/patches-rt/0078-tracing-Inline-tracing_gen_ctx_flags.patch deleted file mode 100644 index 336222076..000000000 --- a/debian/patches-rt/0078-tracing-Inline-tracing_gen_ctx_flags.patch +++ /dev/null @@ -1,184 +0,0 @@ -From 5785fbee2a347916d831f480bad77ea18b1345c2 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 3 Feb 2021 11:05:24 -0500 -Subject: [PATCH 078/296] tracing: Inline tracing_gen_ctx_flags() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Inline tracing_gen_ctx_flags(). This allows to have one ifdef -CONFIG_TRACE_IRQFLAGS_SUPPORT. - -This requires to move `trace_flag_type' so tracing_gen_ctx_flags() can -use it. 
- -Link: https://lkml.kernel.org/r/20210125194511.3924915-3-bigeasy@linutronix.de - -Suggested-by: Steven Rostedt <rostedt@goodmis.org> -Link: https://lkml.kernel.org/r/20210125140323.6b1ff20c@gandalf.local.home -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/trace_events.h | 54 ++++++++++++++++++++++++++++++++++-- - kernel/trace/trace.c | 38 ++----------------------- - kernel/trace/trace.h | 19 ------------- - 3 files changed, 53 insertions(+), 58 deletions(-) - -diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h -index 091250b0895a..67ae708de40d 100644 ---- a/include/linux/trace_events.h -+++ b/include/linux/trace_events.h -@@ -160,9 +160,57 @@ static inline void tracing_generic_entry_update(struct trace_entry *entry, - entry->flags = trace_ctx >> 16; - } - --unsigned int tracing_gen_ctx_flags(unsigned long irqflags); --unsigned int tracing_gen_ctx(void); --unsigned int tracing_gen_ctx_dec(void); -+unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status); -+ -+enum trace_flag_type { -+ TRACE_FLAG_IRQS_OFF = 0x01, -+ TRACE_FLAG_IRQS_NOSUPPORT = 0x02, -+ TRACE_FLAG_NEED_RESCHED = 0x04, -+ TRACE_FLAG_HARDIRQ = 0x08, -+ TRACE_FLAG_SOFTIRQ = 0x10, -+ TRACE_FLAG_PREEMPT_RESCHED = 0x20, -+ TRACE_FLAG_NMI = 0x40, -+}; -+ -+#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT -+static inline unsigned int tracing_gen_ctx_flags(unsigned long irqflags) -+{ -+ unsigned int irq_status = irqs_disabled_flags(irqflags) ? 
-+ TRACE_FLAG_IRQS_OFF : 0; -+ return tracing_gen_ctx_irq_test(irq_status); -+} -+static inline unsigned int tracing_gen_ctx(void) -+{ -+ unsigned long irqflags; -+ -+ local_save_flags(irqflags); -+ return tracing_gen_ctx_flags(irqflags); -+} -+#else -+ -+static inline unsigned int tracing_gen_ctx_flags(unsigned long irqflags) -+{ -+ return tracing_gen_ctx_irq_test(TRACE_FLAG_IRQS_NOSUPPORT); -+} -+static inline unsigned int tracing_gen_ctx(void) -+{ -+ return tracing_gen_ctx_irq_test(TRACE_FLAG_IRQS_NOSUPPORT); -+} -+#endif -+ -+static inline unsigned int tracing_gen_ctx_dec(void) -+{ -+ unsigned int trace_ctx; -+ -+ trace_ctx = tracing_gen_ctx(); -+ /* -+ * Subtract one from the preeption counter if preemption is enabled, -+ * see trace_event_buffer_reserve()for details. -+ */ -+ if (IS_ENABLED(CONFIG_PREEMPTION)) -+ trace_ctx--; -+ return trace_ctx; -+} - - struct trace_event_file; - -diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c -index bb1ffaaede17..992bebf4edf0 100644 ---- a/kernel/trace/trace.c -+++ b/kernel/trace/trace.c -@@ -2578,20 +2578,13 @@ enum print_line_t trace_handle_return(struct trace_seq *s) - } - EXPORT_SYMBOL_GPL(trace_handle_return); - --unsigned int tracing_gen_ctx_flags(unsigned long irqflags) -+unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status) - { -- unsigned int trace_flags = 0; -+ unsigned int trace_flags = irqs_status; - unsigned int pc; - - pc = preempt_count(); - --#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT -- if (irqs_disabled_flags(irqflags)) -- trace_flags |= TRACE_FLAG_IRQS_OFF; --#else -- trace_flags |= TRACE_FLAG_IRQS_NOSUPPORT; --#endif -- - if (pc & NMI_MASK) - trace_flags |= TRACE_FLAG_NMI; - if (pc & HARDIRQ_MASK) -@@ -2607,33 +2600,6 @@ unsigned int tracing_gen_ctx_flags(unsigned long irqflags) - return (trace_flags << 16) | (pc & 0xff); - } - --unsigned int tracing_gen_ctx(void) --{ -- unsigned long irqflags; -- --#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT -- local_save_flags(irqflags); --#else -- irqflags = 0; 
--#endif -- return tracing_gen_ctx_flags(irqflags); --} -- --unsigned int tracing_gen_ctx_dec(void) --{ -- unsigned int trace_ctx; -- -- trace_ctx = tracing_gen_ctx(); -- -- /* -- * Subtract one from the preeption counter if preemption is enabled, -- * see trace_event_buffer_reserve()for details. -- */ -- if (IS_ENABLED(CONFIG_PREEMPTION)) -- trace_ctx--; -- return trace_ctx; --} -- - struct ring_buffer_event * - trace_buffer_lock_reserve(struct trace_buffer *buffer, - int type, -diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h -index b37601bee8b5..2ad9faef718a 100644 ---- a/kernel/trace/trace.h -+++ b/kernel/trace/trace.h -@@ -136,25 +136,6 @@ struct kretprobe_trace_entry_head { - unsigned long ret_ip; - }; - --/* -- * trace_flag_type is an enumeration that holds different -- * states when a trace occurs. These are: -- * IRQS_OFF - interrupts were disabled -- * IRQS_NOSUPPORT - arch does not support irqs_disabled_flags -- * NEED_RESCHED - reschedule is requested -- * HARDIRQ - inside an interrupt handler -- * SOFTIRQ - inside a softirq handler -- */ --enum trace_flag_type { -- TRACE_FLAG_IRQS_OFF = 0x01, -- TRACE_FLAG_IRQS_NOSUPPORT = 0x02, -- TRACE_FLAG_NEED_RESCHED = 0x04, -- TRACE_FLAG_HARDIRQ = 0x08, -- TRACE_FLAG_SOFTIRQ = 0x10, -- TRACE_FLAG_PREEMPT_RESCHED = 0x20, -- TRACE_FLAG_NMI = 0x40, --}; -- - #define TRACE_BUF_SIZE 1024 - - struct trace_array; --- -2.30.2 - diff --git a/debian/patches-rt/0079-tracing-Use-in_serving_softirq-to-deduct-softirq-sta.patch b/debian/patches-rt/0079-tracing-Use-in_serving_softirq-to-deduct-softirq-sta.patch deleted file mode 100644 index 182d3285c..000000000 --- a/debian/patches-rt/0079-tracing-Use-in_serving_softirq-to-deduct-softirq-sta.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 164a1ec3cc41d0f88addd03571295cb277d05829 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 3 Feb 2021 11:05:25 -0500 -Subject: [PATCH 079/296] tracing: Use in_serving_softirq() to deduct softirq - 
status. -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -PREEMPT_RT does not report "serving softirq" because the tracing core -looks at the preemption counter while PREEMPT_RT does not update it -while processing softirqs in order to remain preemptible. The -information is stored somewhere else. -The in_serving_softirq() macro and the SOFTIRQ_OFFSET define are still -working but not on the preempt-counter. - -Use in_serving_softirq() macro which works on PREEMPT_RT. On !PREEMPT_RT -the compiler (gcc-10 / clang-11) is smart enough to optimize the -in_serving_softirq() related read of the preemption counter away. -The only difference I noticed by using in_serving_softirq() on -!PREEMPT_RT is that gcc-10 implemented tracing_gen_ctx_flags() as -reading FLAG, jmp _tracing_gen_ctx_flags(). Without in_serving_softirq() -it inlined _tracing_gen_ctx_flags() into tracing_gen_ctx_flags(). - -Link: https://lkml.kernel.org/r/20210125194511.3924915-4-bigeasy@linutronix.de - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/trace/trace.c | 3 +-- - 1 file changed, 1 insertion(+), 2 deletions(-) - -diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c -index 992bebf4edf0..57fb77628df1 100644 ---- a/kernel/trace/trace.c -+++ b/kernel/trace/trace.c -@@ -2589,8 +2589,7 @@ unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status) - trace_flags |= TRACE_FLAG_NMI; - if (pc & HARDIRQ_MASK) - trace_flags |= TRACE_FLAG_HARDIRQ; -- -- if (pc & SOFTIRQ_OFFSET) -+ if (in_serving_softirq()) - trace_flags |= TRACE_FLAG_SOFTIRQ; - - if (tif_need_resched()) --- -2.30.2 - diff --git a/debian/patches-rt/0080-tracing-Remove-NULL-check-from-current-in-tracing_ge.patch b/debian/patches-rt/0080-tracing-Remove-NULL-check-from-current-in-tracing_ge.patch deleted file mode 100644 index 
1f965fe8f..000000000 --- a/debian/patches-rt/0080-tracing-Remove-NULL-check-from-current-in-tracing_ge.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 70730b093937a4de6b7e0446306c8cb38e9ed524 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 3 Feb 2021 11:05:26 -0500 -Subject: [PATCH 080/296] tracing: Remove NULL check from current in - tracing_generic_entry_update(). -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -I can't imagine when or why `current' would return a NULL pointer. This -check was added in commit - 72829bc3d63cd ("ftrace: move enums to ftrace.h and make helper function global") - -but it doesn't give me hint why it was needed. - -Assume `current' never returns a NULL pointer and remove the check. - -Link: https://lkml.kernel.org/r/20210125194511.3924915-5-bigeasy@linutronix.de - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/trace_events.h | 4 +--- - 1 file changed, 1 insertion(+), 3 deletions(-) - -diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h -index 67ae708de40d..5d1eeac4bfbe 100644 ---- a/include/linux/trace_events.h -+++ b/include/linux/trace_events.h -@@ -152,10 +152,8 @@ static inline void tracing_generic_entry_update(struct trace_entry *entry, - unsigned short type, - unsigned int trace_ctx) - { -- struct task_struct *tsk = current; -- - entry->preempt_count = trace_ctx & 0xff; -- entry->pid = (tsk) ? 
tsk->pid : 0; -+ entry->pid = current->pid; - entry->type = type; - entry->flags = trace_ctx >> 16; - } --- -2.30.2 - diff --git a/debian/patches-rt/0081-printk-inline-log_output-log_store-in-vprintk_store.patch b/debian/patches-rt/0081-printk-inline-log_output-log_store-in-vprintk_store.patch deleted file mode 100644 index 603f8e1ef..000000000 --- a/debian/patches-rt/0081-printk-inline-log_output-log_store-in-vprintk_store.patch +++ /dev/null @@ -1,201 +0,0 @@ -From ad449b446e2fc97122820956f41832f5886c66db Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Wed, 9 Dec 2020 01:50:52 +0106 -Subject: [PATCH 081/296] printk: inline log_output(),log_store() in - vprintk_store() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -In preparation for removing logbuf_lock, inline log_output() -and log_store() into vprintk_store(). This will simplify dealing -with the various code branches and fallbacks that are possible. 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Reviewed-by: Petr Mladek <pmladek@suse.com> -Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> -Signed-off-by: Petr Mladek <pmladek@suse.com> -Link: https://lore.kernel.org/r/20201209004453.17720-2-john.ogness@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 145 +++++++++++++++++++---------------------- - 1 file changed, 67 insertions(+), 78 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index d0df95346ab3..f9953f12f3ca 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -491,52 +491,6 @@ static void truncate_msg(u16 *text_len, u16 *trunc_msg_len) - *trunc_msg_len = 0; - } - --/* insert record into the buffer, discard old ones, update heads */ --static int log_store(u32 caller_id, int facility, int level, -- enum log_flags flags, u64 ts_nsec, -- const struct dev_printk_info *dev_info, -- const char *text, u16 text_len) --{ -- struct prb_reserved_entry e; -- struct printk_record r; -- u16 trunc_msg_len = 0; -- -- prb_rec_init_wr(&r, text_len); -- -- if (!prb_reserve(&e, prb, &r)) { -- /* truncate the message if it is too long for empty buffer */ -- truncate_msg(&text_len, &trunc_msg_len); -- prb_rec_init_wr(&r, text_len + trunc_msg_len); -- /* survive when the log buffer is too small for trunc_msg */ -- if (!prb_reserve(&e, prb, &r)) -- return 0; -- } -- -- /* fill message */ -- memcpy(&r.text_buf[0], text, text_len); -- if (trunc_msg_len) -- memcpy(&r.text_buf[text_len], trunc_msg, trunc_msg_len); -- r.info->text_len = text_len + trunc_msg_len; -- r.info->facility = facility; -- r.info->level = level & 7; -- r.info->flags = flags & 0x1f; -- if (ts_nsec > 0) -- r.info->ts_nsec = ts_nsec; -- else -- r.info->ts_nsec = local_clock(); -- r.info->caller_id = caller_id; -- if (dev_info) -- memcpy(&r.info->dev_info, dev_info, sizeof(r.info->dev_info)); -- -- /* A message without a trailing 
newline can be continued. */ -- if (!(flags & LOG_NEWLINE)) -- prb_commit(&e); -- else -- prb_final_commit(&e); -- -- return (text_len + trunc_msg_len); --} -- - int dmesg_restrict = IS_ENABLED(CONFIG_SECURITY_DMESG_RESTRICT); - - static int syslog_action_restricted(int type) -@@ -1931,44 +1885,28 @@ static inline u32 printk_caller_id(void) - 0x80000000 + raw_smp_processor_id(); - } - --static size_t log_output(int facility, int level, enum log_flags lflags, -- const struct dev_printk_info *dev_info, -- char *text, size_t text_len) --{ -- const u32 caller_id = printk_caller_id(); -- -- if (lflags & LOG_CONT) { -- struct prb_reserved_entry e; -- struct printk_record r; -- -- prb_rec_init_wr(&r, text_len); -- if (prb_reserve_in_last(&e, prb, &r, caller_id, LOG_LINE_MAX)) { -- memcpy(&r.text_buf[r.info->text_len], text, text_len); -- r.info->text_len += text_len; -- if (lflags & LOG_NEWLINE) { -- r.info->flags |= LOG_NEWLINE; -- prb_final_commit(&e); -- } else { -- prb_commit(&e); -- } -- return text_len; -- } -- } -- -- /* Store it in the record log */ -- return log_store(caller_id, facility, level, lflags, 0, -- dev_info, text, text_len); --} -- - /* Must be called under logbuf_lock. */ - int vprintk_store(int facility, int level, - const struct dev_printk_info *dev_info, - const char *fmt, va_list args) - { -+ const u32 caller_id = printk_caller_id(); - static char textbuf[LOG_LINE_MAX]; -- char *text = textbuf; -- size_t text_len; -+ struct prb_reserved_entry e; - enum log_flags lflags = 0; -+ struct printk_record r; -+ u16 trunc_msg_len = 0; -+ char *text = textbuf; -+ u16 text_len; -+ u64 ts_nsec; -+ -+ /* -+ * Since the duration of printk() can vary depending on the message -+ * and state of the ringbuffer, grab the timestamp now so that it is -+ * close to the call of printk(). This provides a more deterministic -+ * timestamp with respect to the caller. 
-+ */ -+ ts_nsec = local_clock(); - - /* - * The printf needs to come first; we need the syslog -@@ -2007,7 +1945,58 @@ int vprintk_store(int facility, int level, - if (dev_info) - lflags |= LOG_NEWLINE; - -- return log_output(facility, level, lflags, dev_info, text, text_len); -+ if (lflags & LOG_CONT) { -+ prb_rec_init_wr(&r, text_len); -+ if (prb_reserve_in_last(&e, prb, &r, caller_id, LOG_LINE_MAX)) { -+ memcpy(&r.text_buf[r.info->text_len], text, text_len); -+ r.info->text_len += text_len; -+ -+ if (lflags & LOG_NEWLINE) { -+ r.info->flags |= LOG_NEWLINE; -+ prb_final_commit(&e); -+ } else { -+ prb_commit(&e); -+ } -+ -+ return text_len; -+ } -+ } -+ -+ /* -+ * Explicitly initialize the record before every prb_reserve() call. -+ * prb_reserve_in_last() and prb_reserve() purposely invalidate the -+ * structure when they fail. -+ */ -+ prb_rec_init_wr(&r, text_len); -+ if (!prb_reserve(&e, prb, &r)) { -+ /* truncate the message if it is too long for empty buffer */ -+ truncate_msg(&text_len, &trunc_msg_len); -+ -+ prb_rec_init_wr(&r, text_len + trunc_msg_len); -+ if (!prb_reserve(&e, prb, &r)) -+ return 0; -+ } -+ -+ /* fill message */ -+ memcpy(&r.text_buf[0], text, text_len); -+ if (trunc_msg_len) -+ memcpy(&r.text_buf[text_len], trunc_msg, trunc_msg_len); -+ r.info->text_len = text_len + trunc_msg_len; -+ r.info->facility = facility; -+ r.info->level = level & 7; -+ r.info->flags = lflags & 0x1f; -+ r.info->ts_nsec = ts_nsec; -+ r.info->caller_id = caller_id; -+ if (dev_info) -+ memcpy(&r.info->dev_info, dev_info, sizeof(r.info->dev_info)); -+ -+ /* A message without a trailing newline can be continued. 
*/ -+ if (!(lflags & LOG_NEWLINE)) -+ prb_commit(&e); -+ else -+ prb_final_commit(&e); -+ -+ return (text_len + trunc_msg_len); - } - - asmlinkage int vprintk_emit(int facility, int level, --- -2.30.2 - diff --git a/debian/patches-rt/0082-printk-remove-logbuf_lock-writer-protection-of-ringb.patch b/debian/patches-rt/0082-printk-remove-logbuf_lock-writer-protection-of-ringb.patch deleted file mode 100644 index 9d7474580..000000000 --- a/debian/patches-rt/0082-printk-remove-logbuf_lock-writer-protection-of-ringb.patch +++ /dev/null @@ -1,251 +0,0 @@ -From 095e0ef35deded180c7db87bec11998f606e8147 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Wed, 9 Dec 2020 01:50:53 +0106 -Subject: [PATCH 082/296] printk: remove logbuf_lock writer-protection of - ringbuffer -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Since the ringbuffer is lockless, there is no need for it to be -protected by @logbuf_lock. Remove @logbuf_lock writer-protection of -the ringbuffer. The reader-protection is not removed because some -variables, used by readers, are using @logbuf_lock for synchronization: -@syslog_seq, @syslog_time, @syslog_partial, @console_seq, -struct kmsg_dumper. - -For PRINTK_NMI_DIRECT_CONTEXT_MASK, @logbuf_lock usage is not removed -because it may be used for dumper synchronization. - -Without @logbuf_lock synchronization of vprintk_store() it is no -longer possible to use the single static buffer for temporarily -sprint'ing the message. Instead, use vsnprintf() to determine the -length and perform the real vscnprintf() using the area reserved from -the ringbuffer. This leads to suboptimal packing of the message data, -but will result in less wasted storage than multiple per-cpu buffers -to support lockless temporary sprint'ing. 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> -Reviewed-by: Petr Mladek <pmladek@suse.com> -Signed-off-by: Petr Mladek <pmladek@suse.com> -Link: https://lore.kernel.org/r/20201209004453.17720-3-john.ogness@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 138 +++++++++++++++++++++++++++++------------ - 1 file changed, 98 insertions(+), 40 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index f9953f12f3ca..c7239d169bbe 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -1127,7 +1127,7 @@ void __init setup_log_buf(int early) - new_descs, ilog2(new_descs_count), - new_infos); - -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - - log_buf_len = new_log_buf_len; - log_buf = new_log_buf; -@@ -1144,7 +1144,7 @@ void __init setup_log_buf(int early) - */ - prb = &printk_rb_dynamic; - -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - - if (seq != prb_next_seq(&printk_rb_static)) { - pr_err("dropped %llu messages\n", -@@ -1885,18 +1885,90 @@ static inline u32 printk_caller_id(void) - 0x80000000 + raw_smp_processor_id(); - } - --/* Must be called under logbuf_lock. */ -+/** -+ * parse_prefix - Parse level and control flags. -+ * -+ * @text: The terminated text message. -+ * @level: A pointer to the current level value, will be updated. -+ * @lflags: A pointer to the current log flags, will be updated. -+ * -+ * @level may be NULL if the caller is not interested in the parsed value. -+ * Otherwise the variable pointed to by @level must be set to -+ * LOGLEVEL_DEFAULT in order to be updated with the parsed value. -+ * -+ * @lflags may be NULL if the caller is not interested in the parsed value. -+ * Otherwise the variable pointed to by @lflags will be OR'd with the parsed -+ * value. 
-+ * -+ * Return: The length of the parsed level and control flags. -+ */ -+static u16 parse_prefix(char *text, int *level, enum log_flags *lflags) -+{ -+ u16 prefix_len = 0; -+ int kern_level; -+ -+ while (*text) { -+ kern_level = printk_get_level(text); -+ if (!kern_level) -+ break; -+ -+ switch (kern_level) { -+ case '0' ... '7': -+ if (level && *level == LOGLEVEL_DEFAULT) -+ *level = kern_level - '0'; -+ break; -+ case 'c': /* KERN_CONT */ -+ if (lflags) -+ *lflags |= LOG_CONT; -+ } -+ -+ prefix_len += 2; -+ text += 2; -+ } -+ -+ return prefix_len; -+} -+ -+static u16 printk_sprint(char *text, u16 size, int facility, enum log_flags *lflags, -+ const char *fmt, va_list args) -+{ -+ u16 text_len; -+ -+ text_len = vscnprintf(text, size, fmt, args); -+ -+ /* Mark and strip a trailing newline. */ -+ if (text_len && text[text_len - 1] == '\n') { -+ text_len--; -+ *lflags |= LOG_NEWLINE; -+ } -+ -+ /* Strip log level and control flags. */ -+ if (facility == 0) { -+ u16 prefix_len; -+ -+ prefix_len = parse_prefix(text, NULL, NULL); -+ if (prefix_len) { -+ text_len -= prefix_len; -+ memmove(text, text + prefix_len, text_len); -+ } -+ } -+ -+ return text_len; -+} -+ -+__printf(4, 0) - int vprintk_store(int facility, int level, - const struct dev_printk_info *dev_info, - const char *fmt, va_list args) - { - const u32 caller_id = printk_caller_id(); -- static char textbuf[LOG_LINE_MAX]; - struct prb_reserved_entry e; - enum log_flags lflags = 0; - struct printk_record r; - u16 trunc_msg_len = 0; -- char *text = textbuf; -+ char prefix_buf[8]; -+ u16 reserve_size; -+ va_list args2; - u16 text_len; - u64 ts_nsec; - -@@ -1909,35 +1981,21 @@ int vprintk_store(int facility, int level, - ts_nsec = local_clock(); - - /* -- * The printf needs to come first; we need the syslog -- * prefix which might be passed-in as a parameter. -+ * The sprintf needs to come first since the syslog prefix might be -+ * passed in as a parameter. 
An extra byte must be reserved so that -+ * later the vscnprintf() into the reserved buffer has room for the -+ * terminating '\0', which is not counted by vsnprintf(). - */ -- text_len = vscnprintf(text, sizeof(textbuf), fmt, args); -- -- /* mark and strip a trailing newline */ -- if (text_len && text[text_len-1] == '\n') { -- text_len--; -- lflags |= LOG_NEWLINE; -- } -- -- /* strip kernel syslog prefix and extract log level or control flags */ -- if (facility == 0) { -- int kern_level; -+ va_copy(args2, args); -+ reserve_size = vsnprintf(&prefix_buf[0], sizeof(prefix_buf), fmt, args2) + 1; -+ va_end(args2); - -- while ((kern_level = printk_get_level(text)) != 0) { -- switch (kern_level) { -- case '0' ... '7': -- if (level == LOGLEVEL_DEFAULT) -- level = kern_level - '0'; -- break; -- case 'c': /* KERN_CONT */ -- lflags |= LOG_CONT; -- } -+ if (reserve_size > LOG_LINE_MAX) -+ reserve_size = LOG_LINE_MAX; - -- text_len -= 2; -- text += 2; -- } -- } -+ /* Extract log level or control flags. */ -+ if (facility == 0) -+ parse_prefix(&prefix_buf[0], &level, &lflags); - - if (level == LOGLEVEL_DEFAULT) - level = default_message_loglevel; -@@ -1946,9 +2004,10 @@ int vprintk_store(int facility, int level, - lflags |= LOG_NEWLINE; - - if (lflags & LOG_CONT) { -- prb_rec_init_wr(&r, text_len); -+ prb_rec_init_wr(&r, reserve_size); - if (prb_reserve_in_last(&e, prb, &r, caller_id, LOG_LINE_MAX)) { -- memcpy(&r.text_buf[r.info->text_len], text, text_len); -+ text_len = printk_sprint(&r.text_buf[r.info->text_len], reserve_size, -+ facility, &lflags, fmt, args); - r.info->text_len += text_len; - - if (lflags & LOG_NEWLINE) { -@@ -1967,18 +2026,18 @@ int vprintk_store(int facility, int level, - * prb_reserve_in_last() and prb_reserve() purposely invalidate the - * structure when they fail. 
- */ -- prb_rec_init_wr(&r, text_len); -+ prb_rec_init_wr(&r, reserve_size); - if (!prb_reserve(&e, prb, &r)) { - /* truncate the message if it is too long for empty buffer */ -- truncate_msg(&text_len, &trunc_msg_len); -+ truncate_msg(&reserve_size, &trunc_msg_len); - -- prb_rec_init_wr(&r, text_len + trunc_msg_len); -+ prb_rec_init_wr(&r, reserve_size + trunc_msg_len); - if (!prb_reserve(&e, prb, &r)) - return 0; - } - - /* fill message */ -- memcpy(&r.text_buf[0], text, text_len); -+ text_len = printk_sprint(&r.text_buf[0], reserve_size, facility, &lflags, fmt, args); - if (trunc_msg_len) - memcpy(&r.text_buf[text_len], trunc_msg, trunc_msg_len); - r.info->text_len = text_len + trunc_msg_len; -@@ -2019,10 +2078,9 @@ asmlinkage int vprintk_emit(int facility, int level, - boot_delay_msec(level); - printk_delay(); - -- /* This stops the holder of console_sem just where we want him */ -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - printed_len = vprintk_store(facility, level, dev_info, fmt, args); -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - - /* If called from the scheduler, we can not call up(). */ - if (!in_sched) { --- -2.30.2 - diff --git a/debian/patches-rt/0083-printk-limit-second-loop-of-syslog_print_all.patch b/debian/patches-rt/0083-printk-limit-second-loop-of-syslog_print_all.patch deleted file mode 100644 index ea2bd87fb..000000000 --- a/debian/patches-rt/0083-printk-limit-second-loop-of-syslog_print_all.patch +++ /dev/null @@ -1,56 +0,0 @@ -From e33348e1d0fec8e062f7cf7b527d5859f422abd4 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Wed, 17 Feb 2021 16:15:31 +0100 -Subject: [PATCH 083/296] printk: limit second loop of syslog_print_all -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The second loop of syslog_print_all() subtracts lengths that were -added in the first loop. 
With commit b031a684bfd0 ("printk: remove -logbuf_lock writer-protection of ringbuffer") it is possible that -records are (over)written during syslog_print_all(). This allows the -possibility of the second loop subtracting lengths that were never -added in the first loop. - -This situation can result in syslog_print_all() filling the buffer -starting from a later record, even though there may have been room -to fit the earlier record(s) as well. - -Fixes: b031a684bfd0 ("printk: remove logbuf_lock writer-protection of ringbuffer") -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Reviewed-by: Petr Mladek <pmladek@suse.com> ---- - kernel/printk/printk.c | 9 ++++++++- - 1 file changed, 8 insertions(+), 1 deletion(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index c7239d169bbe..411787b900ac 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -1495,6 +1495,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - struct printk_info info; - unsigned int line_count; - struct printk_record r; -+ u64 max_seq; - char *text; - int len = 0; - u64 seq; -@@ -1513,9 +1514,15 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - prb_for_each_info(clear_seq, prb, seq, &info, &line_count) - len += get_record_print_text_size(&info, line_count, true, time); - -+ /* -+ * Set an upper bound for the next loop to avoid subtracting lengths -+ * that were never added. 
-+ */ -+ max_seq = seq; -+ - /* move first record forward until length fits into the buffer */ - prb_for_each_info(clear_seq, prb, seq, &info, &line_count) { -- if (len <= size) -+ if (len <= size || info.seq >= max_seq) - break; - len -= get_record_print_text_size(&info, line_count, true, time); - } --- -2.30.2 - diff --git a/debian/patches-rt/0084-printk-kmsg_dump-remove-unused-fields.patch b/debian/patches-rt/0084-printk-kmsg_dump-remove-unused-fields.patch deleted file mode 100644 index 8c172bdca..000000000 --- a/debian/patches-rt/0084-printk-kmsg_dump-remove-unused-fields.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 4daa9d58ef8074f8d2e5d32e90390e1fe48903cb Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 21 Dec 2020 11:19:39 +0106 -Subject: [PATCH 084/296] printk: kmsg_dump: remove unused fields -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -struct kmsg_dumper still contains some fields that were used to -iterate the old ringbuffer. They are no longer used. Remove them -and update the struct documentation. 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Reviewed-by: Petr Mladek <pmladek@suse.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/kmsg_dump.h | 5 +++-- - 1 file changed, 3 insertions(+), 2 deletions(-) - -diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h -index 3378bcbe585e..235c50982c2d 100644 ---- a/include/linux/kmsg_dump.h -+++ b/include/linux/kmsg_dump.h -@@ -36,6 +36,9 @@ enum kmsg_dump_reason { - * through the record iterator - * @max_reason: filter for highest reason number that should be dumped - * @registered: Flag that specifies if this is already registered -+ * @active: Flag that specifies if this is currently dumping -+ * @cur_seq: Points to the oldest message to dump (private) -+ * @next_seq: Points after the newest message to dump (private) - */ - struct kmsg_dumper { - struct list_head list; -@@ -45,8 +48,6 @@ struct kmsg_dumper { - bool registered; - - /* private state of the kmsg iterator */ -- u32 cur_idx; -- u32 next_idx; - u64 cur_seq; - u64 next_seq; - }; --- -2.30.2 - diff --git a/debian/patches-rt/0085-printk-refactor-kmsg_dump_get_buffer.patch b/debian/patches-rt/0085-printk-refactor-kmsg_dump_get_buffer.patch deleted file mode 100644 index 59433fcf0..000000000 --- a/debian/patches-rt/0085-printk-refactor-kmsg_dump_get_buffer.patch +++ /dev/null @@ -1,145 +0,0 @@ -From 4e12561f1d5b3af25e1394a63dd37a0e6064ceda Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:41:56 +0106 -Subject: [PATCH 085/296] printk: refactor kmsg_dump_get_buffer() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -kmsg_dump_get_buffer() requires nearly the same logic as -syslog_print_all(), but uses different variable names and -does not make use of the ringbuffer loop macros. Modify -kmsg_dump_get_buffer() so that the implementation is as similar -to syslog_print_all() as possible. 
- -A follow-up commit will move this common logic into a -separate helper function. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Reviewed-by: Petr Mladek <pmladek@suse.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/kmsg_dump.h | 2 +- - kernel/printk/printk.c | 60 +++++++++++++++++++++------------------ - 2 files changed, 33 insertions(+), 29 deletions(-) - -diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h -index 235c50982c2d..4095a34db0fa 100644 ---- a/include/linux/kmsg_dump.h -+++ b/include/linux/kmsg_dump.h -@@ -62,7 +62,7 @@ bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog, - char *line, size_t size, size_t *len); - - bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, -- char *buf, size_t size, size_t *len); -+ char *buf, size_t size, size_t *len_out); - - void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper); - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 411787b900ac..b4f72b5f70b9 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -3420,7 +3420,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_line); - * read. 
- */ - bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, -- char *buf, size_t size, size_t *len) -+ char *buf, size_t size, size_t *len_out) - { - struct printk_info info; - unsigned int line_count; -@@ -3428,12 +3428,10 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - unsigned long flags; - u64 seq; - u64 next_seq; -- size_t l = 0; -+ size_t len = 0; - bool ret = false; - bool time = printk_time; - -- prb_rec_init_rd(&r, &info, buf, size); -- - if (!dumper->active || !buf || !size) - goto out; - -@@ -3451,48 +3449,54 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - goto out; - } - -- /* calculate length of entire buffer */ -- seq = dumper->cur_seq; -- while (prb_read_valid_info(prb, seq, &info, &line_count)) { -- if (r.info->seq >= dumper->next_seq) -+ /* -+ * Find first record that fits, including all following records, -+ * into the user-provided buffer for this dump. -+ */ -+ -+ prb_for_each_info(dumper->cur_seq, prb, seq, &info, &line_count) { -+ if (info.seq >= dumper->next_seq) - break; -- l += get_record_print_text_size(&info, line_count, syslog, time); -- seq = r.info->seq + 1; -+ len += get_record_print_text_size(&info, line_count, syslog, time); - } - -- /* move first record forward until length fits into the buffer */ -- seq = dumper->cur_seq; -- while (l >= size && prb_read_valid_info(prb, seq, -- &info, &line_count)) { -- if (r.info->seq >= dumper->next_seq) -+ /* -+ * Move first record forward until length fits into the buffer. Ignore -+ * newest messages that were not counted in the above cycle. Messages -+ * might appear and get lost in the meantime. This is the best effort -+ * that prevents an infinite loop. 
-+ */ -+ prb_for_each_info(dumper->cur_seq, prb, seq, &info, &line_count) { -+ if (len < size || info.seq >= dumper->next_seq) - break; -- l -= get_record_print_text_size(&info, line_count, syslog, time); -- seq = r.info->seq + 1; -+ len -= get_record_print_text_size(&info, line_count, syslog, time); - } - -- /* last message in next interation */ -+ /* -+ * Next kmsg_dump_get_buffer() invocation will dump block of -+ * older records stored right before this one. -+ */ - next_seq = seq; - -- /* actually read text into the buffer now */ -- l = 0; -- while (prb_read_valid(prb, seq, &r)) { -+ prb_rec_init_rd(&r, &info, buf, size); -+ -+ len = 0; -+ prb_for_each_record(seq, prb, seq, &r) { - if (r.info->seq >= dumper->next_seq) - break; - -- l += record_print_text(&r, syslog, time); -- -- /* adjust record to store to remaining buffer space */ -- prb_rec_init_rd(&r, &info, buf + l, size - l); -+ len += record_print_text(&r, syslog, time); - -- seq = r.info->seq + 1; -+ /* Adjust record to store to remaining buffer space. 
*/ -+ prb_rec_init_rd(&r, &info, buf + len, size - len); - } - - dumper->next_seq = next_seq; - ret = true; - logbuf_unlock_irqrestore(flags); - out: -- if (len) -- *len = l; -+ if (len_out) -+ *len_out = len; - return ret; - } - EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer); --- -2.30.2 - diff --git a/debian/patches-rt/0086-printk-consolidate-kmsg_dump_get_buffer-syslog_print.patch b/debian/patches-rt/0086-printk-consolidate-kmsg_dump_get_buffer-syslog_print.patch deleted file mode 100644 index 0bbd4bf12..000000000 --- a/debian/patches-rt/0086-printk-consolidate-kmsg_dump_get_buffer-syslog_print.patch +++ /dev/null @@ -1,147 +0,0 @@ -From a6f0a3743121175b9988195f849c0ea1fdf0c5bc Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Wed, 13 Jan 2021 11:29:53 +0106 -Subject: [PATCH 086/296] printk: consolidate - kmsg_dump_get_buffer/syslog_print_all code -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The logic for finding records to fit into a buffer is the same for -kmsg_dump_get_buffer() and syslog_print_all(). Introduce a helper -function find_first_fitting_seq() to handle this logic. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> ---- - kernel/printk/printk.c | 87 ++++++++++++++++++++++++------------------ - 1 file changed, 50 insertions(+), 37 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index b4f72b5f70b9..d6f93ebd7bd0 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -1422,6 +1422,50 @@ static size_t get_record_print_text_size(struct printk_info *info, - return ((prefix_len * line_count) + info->text_len + 1); - } - -+/* -+ * Beginning with @start_seq, find the first record where it and all following -+ * records up to (but not including) @max_seq fit into @size. -+ * -+ * @max_seq is simply an upper bound and does not need to exist. If the caller -+ * does not require an upper bound, -1 can be used for @max_seq. 
-+ */ -+static u64 find_first_fitting_seq(u64 start_seq, u64 max_seq, size_t size, -+ bool syslog, bool time) -+{ -+ struct printk_info info; -+ unsigned int line_count; -+ size_t len = 0; -+ u64 seq; -+ -+ /* Determine the size of the records up to @max_seq. */ -+ prb_for_each_info(start_seq, prb, seq, &info, &line_count) { -+ if (info.seq >= max_seq) -+ break; -+ len += get_record_print_text_size(&info, line_count, syslog, time); -+ } -+ -+ /* -+ * Adjust the upper bound for the next loop to avoid subtracting -+ * lengths that were never added. -+ */ -+ if (seq < max_seq) -+ max_seq = seq; -+ -+ /* -+ * Move first record forward until length fits into the buffer. Ignore -+ * newest messages that were not counted in the above cycle. Messages -+ * might appear and get lost in the meantime. This is a best effort -+ * that prevents an infinite loop that could occur with a retry. -+ */ -+ prb_for_each_info(start_seq, prb, seq, &info, &line_count) { -+ if (len <= size || info.seq >= max_seq) -+ break; -+ len -= get_record_print_text_size(&info, line_count, syslog, time); -+ } -+ -+ return seq; -+} -+ - static int syslog_print(char __user *buf, int size) - { - struct printk_info info; -@@ -1493,9 +1537,7 @@ static int syslog_print(char __user *buf, int size) - static int syslog_print_all(char __user *buf, int size, bool clear) - { - struct printk_info info; -- unsigned int line_count; - struct printk_record r; -- u64 max_seq; - char *text; - int len = 0; - u64 seq; -@@ -1511,21 +1553,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - * Find first record that fits, including all following records, - * into the user-provided buffer for this dump. - */ -- prb_for_each_info(clear_seq, prb, seq, &info, &line_count) -- len += get_record_print_text_size(&info, line_count, true, time); -- -- /* -- * Set an upper bound for the next loop to avoid subtracting lengths -- * that were never added. 
-- */ -- max_seq = seq; -- -- /* move first record forward until length fits into the buffer */ -- prb_for_each_info(clear_seq, prb, seq, &info, &line_count) { -- if (len <= size || info.seq >= max_seq) -- break; -- len -= get_record_print_text_size(&info, line_count, true, time); -- } -+ seq = find_first_fitting_seq(clear_seq, -1, size, true, time); - - prb_rec_init_rd(&r, &info, text, LOG_LINE_MAX + PREFIX_MAX); - -@@ -3423,7 +3451,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - char *buf, size_t size, size_t *len_out) - { - struct printk_info info; -- unsigned int line_count; - struct printk_record r; - unsigned long flags; - u64 seq; -@@ -3451,26 +3478,12 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - - /* - * Find first record that fits, including all following records, -- * into the user-provided buffer for this dump. -+ * into the user-provided buffer for this dump. Pass in size-1 -+ * because this function (by way of record_print_text()) will -+ * not write more than size-1 bytes of text into @buf. - */ -- -- prb_for_each_info(dumper->cur_seq, prb, seq, &info, &line_count) { -- if (info.seq >= dumper->next_seq) -- break; -- len += get_record_print_text_size(&info, line_count, syslog, time); -- } -- -- /* -- * Move first record forward until length fits into the buffer. Ignore -- * newest messages that were not counted in the above cycle. Messages -- * might appear and get lost in the meantime. This is the best effort -- * that prevents an infinite loop. 
-- */ -- prb_for_each_info(dumper->cur_seq, prb, seq, &info, &line_count) { -- if (len < size || info.seq >= dumper->next_seq) -- break; -- len -= get_record_print_text_size(&info, line_count, syslog, time); -- } -+ seq = find_first_fitting_seq(dumper->cur_seq, dumper->next_seq, -+ size - 1, syslog, time); - - /* - * Next kmsg_dump_get_buffer() invocation will dump block of --- -2.30.2 - diff --git a/debian/patches-rt/0087-printk-introduce-CONSOLE_LOG_MAX-for-improved-multi-.patch b/debian/patches-rt/0087-printk-introduce-CONSOLE_LOG_MAX-for-improved-multi-.patch deleted file mode 100644 index b5dbd058b..000000000 --- a/debian/patches-rt/0087-printk-introduce-CONSOLE_LOG_MAX-for-improved-multi-.patch +++ /dev/null @@ -1,95 +0,0 @@ -From 3c2aa05a1def1523e62b65735da85fd6833cf640 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Thu, 10 Dec 2020 12:48:01 +0106 -Subject: [PATCH 087/296] printk: introduce CONSOLE_LOG_MAX for improved - multi-line support -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Instead of using "LOG_LINE_MAX + PREFIX_MAX" for temporary buffer -sizes, introduce CONSOLE_LOG_MAX. This represents the maximum size -that is allowed to be printed to the console for a single record. - -Rather than setting CONSOLE_LOG_MAX to "LOG_LINE_MAX + PREFIX_MAX" -(1024), increase it to 4096. With a larger buffer size, multi-line -records that are nearly LOG_LINE_MAX in length will have a better -chance of being fully printed. (When formatting a record for the -console, each line of a multi-line record is prepended with a copy -of the prefix.) 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> ---- - kernel/printk/printk.c | 18 +++++++++++------- - 1 file changed, 11 insertions(+), 7 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index d6f93ebd7bd0..f79e7515b5f1 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -410,8 +410,13 @@ static u64 clear_seq; - #else - #define PREFIX_MAX 32 - #endif -+ -+/* the maximum size allowed to be reserved for a record */ - #define LOG_LINE_MAX (1024 - PREFIX_MAX) - -+/* the maximum size of a formatted record (i.e. with prefix added per line) */ -+#define CONSOLE_LOG_MAX 4096 -+ - #define LOG_LEVEL(v) ((v) & 0x07) - #define LOG_FACILITY(v) ((v) >> 3 & 0xff) - -@@ -1473,11 +1478,11 @@ static int syslog_print(char __user *buf, int size) - char *text; - int len = 0; - -- text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL); -+ text = kmalloc(CONSOLE_LOG_MAX, GFP_KERNEL); - if (!text) - return -ENOMEM; - -- prb_rec_init_rd(&r, &info, text, LOG_LINE_MAX + PREFIX_MAX); -+ prb_rec_init_rd(&r, &info, text, CONSOLE_LOG_MAX); - - while (size > 0) { - size_t n; -@@ -1543,7 +1548,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - u64 seq; - bool time; - -- text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL); -+ text = kmalloc(CONSOLE_LOG_MAX, GFP_KERNEL); - if (!text) - return -ENOMEM; - -@@ -1555,7 +1560,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - */ - seq = find_first_fitting_seq(clear_seq, -1, size, true, time); - -- prb_rec_init_rd(&r, &info, text, LOG_LINE_MAX + PREFIX_MAX); -+ prb_rec_init_rd(&r, &info, text, CONSOLE_LOG_MAX); - - len = 0; - prb_for_each_record(seq, prb, seq, &r) { -@@ -2188,8 +2193,7 @@ EXPORT_SYMBOL(printk); - - #else /* CONFIG_PRINTK */ - --#define LOG_LINE_MAX 0 --#define PREFIX_MAX 0 -+#define CONSOLE_LOG_MAX 0 - #define printk_time false - - #define prb_read_valid(rb, seq, r) false -@@ -2500,7 +2504,7 @@ static inline int 
can_use_console(void) - void console_unlock(void) - { - static char ext_text[CONSOLE_EXT_LOG_MAX]; -- static char text[LOG_LINE_MAX + PREFIX_MAX]; -+ static char text[CONSOLE_LOG_MAX]; - unsigned long flags; - bool do_cond_resched, retry; - struct printk_info info; --- -2.30.2 - diff --git a/debian/patches-rt/0088-printk-use-seqcount_latch-for-clear_seq.patch b/debian/patches-rt/0088-printk-use-seqcount_latch-for-clear_seq.patch deleted file mode 100644 index fda9ef2b5..000000000 --- a/debian/patches-rt/0088-printk-use-seqcount_latch-for-clear_seq.patch +++ /dev/null @@ -1,147 +0,0 @@ -From 9349d1ef6b6526cf60f5a2963b13b86ae2f65d7a Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:41:58 +0106 -Subject: [PATCH 088/296] printk: use seqcount_latch for clear_seq -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -kmsg_dump_rewind_nolock() locklessly reads @clear_seq. However, -this is not done atomically. Since @clear_seq is 64-bit, this -cannot be an atomic operation for all platforms. Therefore, use -a seqcount_latch to allow readers to always read a consistent -value. 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Reviewed-by: Petr Mladek <pmladek@suse.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 58 ++++++++++++++++++++++++++++++++++++------ - 1 file changed, 50 insertions(+), 8 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index f79e7515b5f1..a71e0d41ccb5 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -402,8 +402,21 @@ static u64 console_seq; - static u64 exclusive_console_stop_seq; - static unsigned long console_dropped; - --/* the next printk record to read after the last 'clear' command */ --static u64 clear_seq; -+struct latched_seq { -+ seqcount_latch_t latch; -+ u64 val[2]; -+}; -+ -+/* -+ * The next printk record to read after the last 'clear' command. There are -+ * two copies (updated with seqcount_latch) so that reads can locklessly -+ * access a valid value. Writers are synchronized by @logbuf_lock. -+ */ -+static struct latched_seq clear_seq = { -+ .latch = SEQCNT_LATCH_ZERO(clear_seq.latch), -+ .val[0] = 0, -+ .val[1] = 0, -+}; - - #ifdef CONFIG_PRINTK_CALLER - #define PREFIX_MAX 48 -@@ -457,6 +470,31 @@ bool printk_percpu_data_ready(void) - return __printk_percpu_data_ready; - } - -+/* Must be called under logbuf_lock. */ -+static void latched_seq_write(struct latched_seq *ls, u64 val) -+{ -+ raw_write_seqcount_latch(&ls->latch); -+ ls->val[0] = val; -+ raw_write_seqcount_latch(&ls->latch); -+ ls->val[1] = val; -+} -+ -+/* Can be called from any context. 
*/ -+static u64 latched_seq_read_nolock(struct latched_seq *ls) -+{ -+ unsigned int seq; -+ unsigned int idx; -+ u64 val; -+ -+ do { -+ seq = raw_read_seqcount_latch(&ls->latch); -+ idx = seq & 0x1; -+ val = ls->val[idx]; -+ } while (read_seqcount_latch_retry(&ls->latch, seq)); -+ -+ return val; -+} -+ - /* Return log buffer address */ - char *log_buf_addr_get(void) - { -@@ -802,7 +840,7 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) - * like issued by 'dmesg -c'. Reading /dev/kmsg itself - * changes no global state, and does not clear anything. - */ -- user->seq = clear_seq; -+ user->seq = latched_seq_read_nolock(&clear_seq); - break; - case SEEK_END: - /* after the last record */ -@@ -961,6 +999,9 @@ void log_buf_vmcoreinfo_setup(void) - - VMCOREINFO_SIZE(atomic_long_t); - VMCOREINFO_TYPE_OFFSET(atomic_long_t, counter); -+ -+ VMCOREINFO_STRUCT_SIZE(latched_seq); -+ VMCOREINFO_OFFSET(latched_seq, val); - } - #endif - -@@ -1558,7 +1599,8 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - * Find first record that fits, including all following records, - * into the user-provided buffer for this dump. 
- */ -- seq = find_first_fitting_seq(clear_seq, -1, size, true, time); -+ seq = find_first_fitting_seq(latched_seq_read_nolock(&clear_seq), -1, -+ size, true, time); - - prb_rec_init_rd(&r, &info, text, CONSOLE_LOG_MAX); - -@@ -1585,7 +1627,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - } - - if (clear) -- clear_seq = seq; -+ latched_seq_write(&clear_seq, seq); - logbuf_unlock_irq(); - - kfree(text); -@@ -1595,7 +1637,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - static void syslog_clear(void) - { - logbuf_lock_irq(); -- clear_seq = prb_next_seq(prb); -+ latched_seq_write(&clear_seq, prb_next_seq(prb)); - logbuf_unlock_irq(); - } - -@@ -3332,7 +3374,7 @@ void kmsg_dump(enum kmsg_dump_reason reason) - dumper->active = true; - - logbuf_lock_irqsave(flags); -- dumper->cur_seq = clear_seq; -+ dumper->cur_seq = latched_seq_read_nolock(&clear_seq); - dumper->next_seq = prb_next_seq(prb); - logbuf_unlock_irqrestore(flags); - -@@ -3530,7 +3572,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer); - */ - void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper) - { -- dumper->cur_seq = clear_seq; -+ dumper->cur_seq = latched_seq_read_nolock(&clear_seq); - dumper->next_seq = prb_next_seq(prb); - } - --- -2.30.2 - diff --git a/debian/patches-rt/0089-printk-use-atomic64_t-for-devkmsg_user.seq.patch b/debian/patches-rt/0089-printk-use-atomic64_t-for-devkmsg_user.seq.patch deleted file mode 100644 index 71d74a331..000000000 --- a/debian/patches-rt/0089-printk-use-atomic64_t-for-devkmsg_user.seq.patch +++ /dev/null @@ -1,112 +0,0 @@ -From 039aa3f5e5410e8f2d0f29d2ca52f99421978724 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Thu, 10 Dec 2020 15:33:40 +0106 -Subject: [PATCH 089/296] printk: use atomic64_t for devkmsg_user.seq -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -@user->seq is indirectly protected by @logbuf_lock. 
Once @logbuf_lock -is removed, @user->seq will be no longer safe from an atomicity point -of view. - -In preparation for the removal of @logbuf_lock, change it to -atomic64_t to provide this safety. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> ---- - kernel/printk/printk.c | 22 +++++++++++----------- - 1 file changed, 11 insertions(+), 11 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index a71e0d41ccb5..17631a396c92 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -662,7 +662,7 @@ static ssize_t msg_print_ext_body(char *buf, size_t size, - - /* /dev/kmsg - userspace message inject/listen interface */ - struct devkmsg_user { -- u64 seq; -+ atomic64_t seq; - struct ratelimit_state rs; - struct mutex lock; - char buf[CONSOLE_EXT_LOG_MAX]; -@@ -764,7 +764,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, - return ret; - - logbuf_lock_irq(); -- if (!prb_read_valid(prb, user->seq, r)) { -+ if (!prb_read_valid(prb, atomic64_read(&user->seq), r)) { - if (file->f_flags & O_NONBLOCK) { - ret = -EAGAIN; - logbuf_unlock_irq(); -@@ -773,15 +773,15 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, - - logbuf_unlock_irq(); - ret = wait_event_interruptible(log_wait, -- prb_read_valid(prb, user->seq, r)); -+ prb_read_valid(prb, atomic64_read(&user->seq), r)); - if (ret) - goto out; - logbuf_lock_irq(); - } - -- if (r->info->seq != user->seq) { -+ if (r->info->seq != atomic64_read(&user->seq)) { - /* our last seen message is gone, return error and reset */ -- user->seq = r->info->seq; -+ atomic64_set(&user->seq, r->info->seq); - ret = -EPIPE; - logbuf_unlock_irq(); - goto out; -@@ -792,7 +792,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, - &r->text_buf[0], r->info->text_len, - &r->info->dev_info); - -- user->seq = r->info->seq + 1; -+ atomic64_set(&user->seq, r->info->seq + 1); - logbuf_unlock_irq(); - - if (len > count) { -@@ -832,7 +832,7 @@ static 
loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) - switch (whence) { - case SEEK_SET: - /* the first record */ -- user->seq = prb_first_valid_seq(prb); -+ atomic64_set(&user->seq, prb_first_valid_seq(prb)); - break; - case SEEK_DATA: - /* -@@ -840,11 +840,11 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) - * like issued by 'dmesg -c'. Reading /dev/kmsg itself - * changes no global state, and does not clear anything. - */ -- user->seq = latched_seq_read_nolock(&clear_seq); -+ atomic64_set(&user->seq, latched_seq_read_nolock(&clear_seq)); - break; - case SEEK_END: - /* after the last record */ -- user->seq = prb_next_seq(prb); -+ atomic64_set(&user->seq, prb_next_seq(prb)); - break; - default: - ret = -EINVAL; -@@ -867,7 +867,7 @@ static __poll_t devkmsg_poll(struct file *file, poll_table *wait) - logbuf_lock_irq(); - if (prb_read_valid_info(prb, user->seq, &info, NULL)) { - /* return error when data has vanished underneath us */ -- if (info.seq != user->seq) -+ if (info.seq != atomic64_read(&user->seq)) - ret = EPOLLIN|EPOLLRDNORM|EPOLLERR|EPOLLPRI; - else - ret = EPOLLIN|EPOLLRDNORM; -@@ -906,7 +906,7 @@ static int devkmsg_open(struct inode *inode, struct file *file) - &user->text_buf[0], sizeof(user->text_buf)); - - logbuf_lock_irq(); -- user->seq = prb_first_valid_seq(prb); -+ atomic64_set(&user->seq, prb_first_valid_seq(prb)); - logbuf_unlock_irq(); - - file->private_data = user; --- -2.30.2 - diff --git a/debian/patches-rt/0090-printk-add-syslog_lock.patch b/debian/patches-rt/0090-printk-add-syslog_lock.patch deleted file mode 100644 index 2d9e7fe0b..000000000 --- a/debian/patches-rt/0090-printk-add-syslog_lock.patch +++ /dev/null @@ -1,159 +0,0 @@ -From 75053f8a4da50b1811d2dc6f47564b1cba24f789 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Thu, 10 Dec 2020 16:58:02 +0106 -Subject: [PATCH 090/296] printk: add syslog_lock -Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The global variables @syslog_seq, @syslog_partial, @syslog_time -and write access to @clear_seq are protected by @logbuf_lock. -Once @logbuf_lock is removed, these variables will need their -own synchronization method. Introduce @syslog_lock for this -purpose. - -@syslog_lock is a raw_spin_lock for now. This simplifies the -transition to removing @logbuf_lock. Once @logbuf_lock and the -safe buffers are removed, @syslog_lock can change to spin_lock. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 41 +++++++++++++++++++++++++++++++++++++---- - 1 file changed, 37 insertions(+), 4 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 17631a396c92..75c2f697ee1c 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -390,8 +390,12 @@ DEFINE_RAW_SPINLOCK(logbuf_lock); - printk_safe_exit_irqrestore(flags); \ - } while (0) - -+/* syslog_lock protects syslog_* variables and write access to clear_seq. */ -+static DEFINE_RAW_SPINLOCK(syslog_lock); -+ - #ifdef CONFIG_PRINTK - DECLARE_WAIT_QUEUE_HEAD(log_wait); -+/* All 3 protected by @syslog_lock. */ - /* the next printk record to read by syslog(READ) or /proc/kmsg */ - static u64 syslog_seq; - static size_t syslog_partial; -@@ -410,7 +414,7 @@ struct latched_seq { - /* - * The next printk record to read after the last 'clear' command. There are - * two copies (updated with seqcount_latch) so that reads can locklessly -- * access a valid value. Writers are synchronized by @logbuf_lock. -+ * access a valid value. Writers are synchronized by @syslog_lock. - */ - static struct latched_seq clear_seq = { - .latch = SEQCNT_LATCH_ZERO(clear_seq.latch), -@@ -470,7 +474,7 @@ bool printk_percpu_data_ready(void) - return __printk_percpu_data_ready; - } - --/* Must be called under logbuf_lock. 
*/ -+/* Must be called under syslog_lock. */ - static void latched_seq_write(struct latched_seq *ls, u64 val) - { - raw_write_seqcount_latch(&ls->latch); -@@ -1530,7 +1534,9 @@ static int syslog_print(char __user *buf, int size) - size_t skip; - - logbuf_lock_irq(); -+ raw_spin_lock(&syslog_lock); - if (!prb_read_valid(prb, syslog_seq, &r)) { -+ raw_spin_unlock(&syslog_lock); - logbuf_unlock_irq(); - break; - } -@@ -1560,6 +1566,7 @@ static int syslog_print(char __user *buf, int size) - syslog_partial += n; - } else - n = 0; -+ raw_spin_unlock(&syslog_lock); - logbuf_unlock_irq(); - - if (!n) -@@ -1626,8 +1633,11 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - break; - } - -- if (clear) -+ if (clear) { -+ raw_spin_lock(&syslog_lock); - latched_seq_write(&clear_seq, seq); -+ raw_spin_unlock(&syslog_lock); -+ } - logbuf_unlock_irq(); - - kfree(text); -@@ -1637,10 +1647,24 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - static void syslog_clear(void) - { - logbuf_lock_irq(); -+ raw_spin_lock(&syslog_lock); - latched_seq_write(&clear_seq, prb_next_seq(prb)); -+ raw_spin_unlock(&syslog_lock); - logbuf_unlock_irq(); - } - -+/* Return a consistent copy of @syslog_seq. 
*/ -+static u64 read_syslog_seq_irq(void) -+{ -+ u64 seq; -+ -+ raw_spin_lock_irq(&syslog_lock); -+ seq = syslog_seq; -+ raw_spin_unlock_irq(&syslog_lock); -+ -+ return seq; -+} -+ - int do_syslog(int type, char __user *buf, int len, int source) - { - struct printk_info info; -@@ -1664,8 +1688,9 @@ int do_syslog(int type, char __user *buf, int len, int source) - return 0; - if (!access_ok(buf, len)) - return -EFAULT; -+ - error = wait_event_interruptible(log_wait, -- prb_read_valid(prb, syslog_seq, NULL)); -+ prb_read_valid(prb, read_syslog_seq_irq(), NULL)); - if (error) - return error; - error = syslog_print(buf, len); -@@ -1714,8 +1739,10 @@ int do_syslog(int type, char __user *buf, int len, int source) - /* Number of chars in the log buffer */ - case SYSLOG_ACTION_SIZE_UNREAD: - logbuf_lock_irq(); -+ raw_spin_lock(&syslog_lock); - if (!prb_read_valid_info(prb, syslog_seq, &info, NULL)) { - /* No unread messages. */ -+ raw_spin_unlock(&syslog_lock); - logbuf_unlock_irq(); - return 0; - } -@@ -1744,6 +1771,7 @@ int do_syslog(int type, char __user *buf, int len, int source) - } - error -= syslog_partial; - } -+ raw_spin_unlock(&syslog_lock); - logbuf_unlock_irq(); - break; - /* Size of the log buffer */ -@@ -2986,7 +3014,12 @@ void register_console(struct console *newcon) - */ - exclusive_console = newcon; - exclusive_console_stop_seq = console_seq; -+ -+ /* Get a consistent copy of @syslog_seq. 
*/ -+ raw_spin_lock(&syslog_lock); - console_seq = syslog_seq; -+ raw_spin_unlock(&syslog_lock); -+ - logbuf_unlock_irqrestore(flags); - } - console_unlock(); --- -2.30.2 - diff --git a/debian/patches-rt/0091-printk-introduce-a-kmsg_dump-iterator.patch b/debian/patches-rt/0091-printk-introduce-a-kmsg_dump-iterator.patch deleted file mode 100644 index e32af2227..000000000 --- a/debian/patches-rt/0091-printk-introduce-a-kmsg_dump-iterator.patch +++ /dev/null @@ -1,561 +0,0 @@ -From 9f1cdce8cd6d77113ba1d79160bbe14e026b0b0f Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Fri, 18 Dec 2020 11:40:08 +0000 -Subject: [PATCH 091/296] printk: introduce a kmsg_dump iterator -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Rather than store the iterator information into the registered -kmsg_dump structure, create a separate iterator structure. The -kmsg_dump_iter structure can reside on the stack of the caller, -thus allowing lockless use of the kmsg_dump functions. - -This is in preparation for removal of @logbuf_lock. 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/powerpc/kernel/nvram_64.c | 12 ++-- - arch/powerpc/platforms/powernv/opal-kmsg.c | 3 +- - arch/powerpc/xmon/xmon.c | 6 +- - arch/um/kernel/kmsg_dump.c | 5 +- - drivers/hv/vmbus_drv.c | 5 +- - drivers/mtd/mtdoops.c | 5 +- - fs/pstore/platform.c | 5 +- - include/linux/kmsg_dump.h | 43 +++++++------- - kernel/debug/kdb/kdb_main.c | 10 ++-- - kernel/printk/printk.c | 65 +++++++++++----------- - 10 files changed, 84 insertions(+), 75 deletions(-) - -diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c -index 532f22637783..1ef55f4b389a 100644 ---- a/arch/powerpc/kernel/nvram_64.c -+++ b/arch/powerpc/kernel/nvram_64.c -@@ -73,7 +73,8 @@ static const char *nvram_os_partitions[] = { - }; - - static void oops_to_nvram(struct kmsg_dumper *dumper, -- enum kmsg_dump_reason reason); -+ enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter); - - static struct kmsg_dumper nvram_kmsg_dumper = { - .dump = oops_to_nvram -@@ -643,7 +644,8 @@ void __init nvram_init_oops_partition(int rtas_partition_exists) - * partition. If that's too much, go back and capture uncompressed text. 
- */ - static void oops_to_nvram(struct kmsg_dumper *dumper, -- enum kmsg_dump_reason reason) -+ enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter) - { - struct oops_log_info *oops_hdr = (struct oops_log_info *)oops_buf; - static unsigned int oops_count = 0; -@@ -681,13 +683,13 @@ static void oops_to_nvram(struct kmsg_dumper *dumper, - return; - - if (big_oops_buf) { -- kmsg_dump_get_buffer(dumper, false, -+ kmsg_dump_get_buffer(iter, false, - big_oops_buf, big_oops_buf_sz, &text_len); - rc = zip_oops(text_len); - } - if (rc != 0) { -- kmsg_dump_rewind(dumper); -- kmsg_dump_get_buffer(dumper, false, -+ kmsg_dump_rewind(iter); -+ kmsg_dump_get_buffer(iter, false, - oops_data, oops_data_sz, &text_len); - err_type = ERR_TYPE_KERNEL_PANIC; - oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION); -diff --git a/arch/powerpc/platforms/powernv/opal-kmsg.c b/arch/powerpc/platforms/powernv/opal-kmsg.c -index 6c3bc4b4da98..ec862846bc82 100644 ---- a/arch/powerpc/platforms/powernv/opal-kmsg.c -+++ b/arch/powerpc/platforms/powernv/opal-kmsg.c -@@ -20,7 +20,8 @@ - * message, it just ensures that OPAL completely flushes the console buffer. 
- */ - static void kmsg_dump_opal_console_flush(struct kmsg_dumper *dumper, -- enum kmsg_dump_reason reason) -+ enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter) - { - /* - * Outside of a panic context the pollers will continue to run, -diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c -index 5559edf36756..b71690e8245b 100644 ---- a/arch/powerpc/xmon/xmon.c -+++ b/arch/powerpc/xmon/xmon.c -@@ -3005,7 +3005,7 @@ print_address(unsigned long addr) - static void - dump_log_buf(void) - { -- struct kmsg_dumper dumper = { .active = 1 }; -+ struct kmsg_dumper_iter iter = { .active = 1 }; - unsigned char buf[128]; - size_t len; - -@@ -3017,9 +3017,9 @@ dump_log_buf(void) - catch_memory_errors = 1; - sync(); - -- kmsg_dump_rewind_nolock(&dumper); -+ kmsg_dump_rewind_nolock(&iter); - xmon_start_pagination(); -- while (kmsg_dump_get_line_nolock(&dumper, false, buf, sizeof(buf), &len)) { -+ while (kmsg_dump_get_line_nolock(&iter, false, buf, sizeof(buf), &len)) { - buf[len] = '\0'; - printf("%s", buf); - } -diff --git a/arch/um/kernel/kmsg_dump.c b/arch/um/kernel/kmsg_dump.c -index e4abac6c9727..f38349ad00ea 100644 ---- a/arch/um/kernel/kmsg_dump.c -+++ b/arch/um/kernel/kmsg_dump.c -@@ -6,7 +6,8 @@ - #include <os.h> - - static void kmsg_dumper_stdout(struct kmsg_dumper *dumper, -- enum kmsg_dump_reason reason) -+ enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter) - { - static char line[1024]; - struct console *con; -@@ -25,7 +26,7 @@ static void kmsg_dumper_stdout(struct kmsg_dumper *dumper, - return; - - printf("kmsg_dump:\n"); -- while (kmsg_dump_get_line(dumper, true, line, sizeof(line), &len)) { -+ while (kmsg_dump_get_line(iter, true, line, sizeof(line), &len)) { - line[len] = '\0'; - printf("%s", line); - } -diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c -index a5a402e776c7..d5cc74ecc582 100644 ---- a/drivers/hv/vmbus_drv.c -+++ b/drivers/hv/vmbus_drv.c -@@ -1359,7 +1359,8 @@ static void vmbus_isr(void) - * buffer 
and call into Hyper-V to transfer the data. - */ - static void hv_kmsg_dump(struct kmsg_dumper *dumper, -- enum kmsg_dump_reason reason) -+ enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter) - { - size_t bytes_written; - phys_addr_t panic_pa; -@@ -1374,7 +1375,7 @@ static void hv_kmsg_dump(struct kmsg_dumper *dumper, - * Write dump contents to the page. No need to synchronize; panic should - * be single-threaded. - */ -- kmsg_dump_get_buffer(dumper, false, hv_panic_page, HV_HYP_PAGE_SIZE, -+ kmsg_dump_get_buffer(iter, false, hv_panic_page, HV_HYP_PAGE_SIZE, - &bytes_written); - if (bytes_written) - hyperv_report_panic_msg(panic_pa, bytes_written); -diff --git a/drivers/mtd/mtdoops.c b/drivers/mtd/mtdoops.c -index 774970bfcf85..6bc2c728adb7 100644 ---- a/drivers/mtd/mtdoops.c -+++ b/drivers/mtd/mtdoops.c -@@ -267,7 +267,8 @@ static void find_next_position(struct mtdoops_context *cxt) - } - - static void mtdoops_do_dump(struct kmsg_dumper *dumper, -- enum kmsg_dump_reason reason) -+ enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter) - { - struct mtdoops_context *cxt = container_of(dumper, - struct mtdoops_context, dump); -@@ -276,7 +277,7 @@ static void mtdoops_do_dump(struct kmsg_dumper *dumper, - if (reason == KMSG_DUMP_OOPS && !dump_oops) - return; - -- kmsg_dump_get_buffer(dumper, true, cxt->oops_buf + MTDOOPS_HEADER_SIZE, -+ kmsg_dump_get_buffer(iter, true, cxt->oops_buf + MTDOOPS_HEADER_SIZE, - record_size - MTDOOPS_HEADER_SIZE, NULL); - - if (reason != KMSG_DUMP_OOPS) { -diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c -index b1ebf7b61732..b7e3a6bacbb0 100644 ---- a/fs/pstore/platform.c -+++ b/fs/pstore/platform.c -@@ -383,7 +383,8 @@ void pstore_record_init(struct pstore_record *record, - * end of the buffer. 
- */ - static void pstore_dump(struct kmsg_dumper *dumper, -- enum kmsg_dump_reason reason) -+ enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter) - { - unsigned long total = 0; - const char *why; -@@ -435,7 +436,7 @@ static void pstore_dump(struct kmsg_dumper *dumper, - dst_size -= header_size; - - /* Write dump contents. */ -- if (!kmsg_dump_get_buffer(dumper, true, dst + header_size, -+ if (!kmsg_dump_get_buffer(iter, true, dst + header_size, - dst_size, &dump_size)) - break; - -diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h -index 4095a34db0fa..2fdb10ab1799 100644 ---- a/include/linux/kmsg_dump.h -+++ b/include/linux/kmsg_dump.h -@@ -29,6 +29,18 @@ enum kmsg_dump_reason { - KMSG_DUMP_MAX - }; - -+/** -+ * struct kmsg_dumper_iter - iterator for kernel crash message dumper -+ * @active: Flag that specifies if this is currently dumping -+ * @cur_seq: Points to the oldest message to dump (private) -+ * @next_seq: Points after the newest message to dump (private) -+ */ -+struct kmsg_dumper_iter { -+ bool active; -+ u64 cur_seq; -+ u64 next_seq; -+}; -+ - /** - * struct kmsg_dumper - kernel crash message dumper structure - * @list: Entry in the dumper list (private) -@@ -36,37 +48,30 @@ enum kmsg_dump_reason { - * through the record iterator - * @max_reason: filter for highest reason number that should be dumped - * @registered: Flag that specifies if this is already registered -- * @active: Flag that specifies if this is currently dumping -- * @cur_seq: Points to the oldest message to dump (private) -- * @next_seq: Points after the newest message to dump (private) - */ - struct kmsg_dumper { - struct list_head list; -- void (*dump)(struct kmsg_dumper *dumper, enum kmsg_dump_reason reason); -+ void (*dump)(struct kmsg_dumper *dumper, enum kmsg_dump_reason reason, -+ struct kmsg_dumper_iter *iter); - enum kmsg_dump_reason max_reason; -- bool active; - bool registered; -- -- /* private state of the kmsg iterator */ -- u64 cur_seq; -- u64 
next_seq; - }; - - #ifdef CONFIG_PRINTK - void kmsg_dump(enum kmsg_dump_reason reason); - --bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog, -+bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, bool syslog, - char *line, size_t size, size_t *len); - --bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog, -+bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - char *line, size_t size, size_t *len); - --bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, -+bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - char *buf, size_t size, size_t *len_out); - --void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper); -+void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter); - --void kmsg_dump_rewind(struct kmsg_dumper *dumper); -+void kmsg_dump_rewind(struct kmsg_dumper_iter *dumper_iter); - - int kmsg_dump_register(struct kmsg_dumper *dumper); - -@@ -78,30 +83,30 @@ static inline void kmsg_dump(enum kmsg_dump_reason reason) - { - } - --static inline bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, -+static inline bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, - bool syslog, const char *line, - size_t size, size_t *len) - { - return false; - } - --static inline bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog, -+static inline bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - const char *line, size_t size, size_t *len) - { - return false; - } - --static inline bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, -+static inline bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - char *buf, size_t size, size_t *len) - { - return false; - } - --static inline void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper) -+static inline void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter) - { - } - --static inline void kmsg_dump_rewind(struct kmsg_dumper *dumper) -+static inline void 
kmsg_dump_rewind(struct kmsg_dumper_iter *iter) - { - } - -diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c -index 930ac1b25ec7..7ae9da245e4b 100644 ---- a/kernel/debug/kdb/kdb_main.c -+++ b/kernel/debug/kdb/kdb_main.c -@@ -2101,7 +2101,7 @@ static int kdb_dmesg(int argc, const char **argv) - int adjust = 0; - int n = 0; - int skip = 0; -- struct kmsg_dumper dumper = { .active = 1 }; -+ struct kmsg_dumper_iter iter = { .active = 1 }; - size_t len; - char buf[201]; - -@@ -2126,8 +2126,8 @@ static int kdb_dmesg(int argc, const char **argv) - kdb_set(2, setargs); - } - -- kmsg_dump_rewind_nolock(&dumper); -- while (kmsg_dump_get_line_nolock(&dumper, 1, NULL, 0, NULL)) -+ kmsg_dump_rewind_nolock(&iter); -+ while (kmsg_dump_get_line_nolock(&iter, 1, NULL, 0, NULL)) - n++; - - if (lines < 0) { -@@ -2159,8 +2159,8 @@ static int kdb_dmesg(int argc, const char **argv) - if (skip >= n || skip < 0) - return 0; - -- kmsg_dump_rewind_nolock(&dumper); -- while (kmsg_dump_get_line_nolock(&dumper, 1, buf, sizeof(buf), &len)) { -+ kmsg_dump_rewind_nolock(&iter); -+ while (kmsg_dump_get_line_nolock(&iter, 1, buf, sizeof(buf), &len)) { - if (skip) { - skip--; - continue; -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 75c2f697ee1c..6c611328f217 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -3385,6 +3385,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_reason_str); - */ - void kmsg_dump(enum kmsg_dump_reason reason) - { -+ struct kmsg_dumper_iter iter; - struct kmsg_dumper *dumper; - unsigned long flags; - -@@ -3404,25 +3405,21 @@ void kmsg_dump(enum kmsg_dump_reason reason) - continue; - - /* initialize iterator with data about the stored records */ -- dumper->active = true; -- -+ iter.active = true; - logbuf_lock_irqsave(flags); -- dumper->cur_seq = latched_seq_read_nolock(&clear_seq); -- dumper->next_seq = prb_next_seq(prb); -+ iter.cur_seq = latched_seq_read_nolock(&clear_seq); -+ iter.next_seq = prb_next_seq(prb); - 
logbuf_unlock_irqrestore(flags); - - /* invoke dumper which will iterate over records */ -- dumper->dump(dumper, reason); -- -- /* reset iterator */ -- dumper->active = false; -+ dumper->dump(dumper, reason, &iter); - } - rcu_read_unlock(); - } - - /** - * kmsg_dump_get_line_nolock - retrieve one kmsg log line (unlocked version) -- * @dumper: registered kmsg dumper -+ * @iter: kmsg dumper iterator - * @syslog: include the "<4>" prefixes - * @line: buffer to copy the line to - * @size: maximum size of the buffer -@@ -3439,7 +3436,7 @@ void kmsg_dump(enum kmsg_dump_reason reason) - * - * The function is similar to kmsg_dump_get_line(), but grabs no locks. - */ --bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog, -+bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, bool syslog, - char *line, size_t size, size_t *len) - { - struct printk_info info; -@@ -3450,16 +3447,16 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog, - - prb_rec_init_rd(&r, &info, line, size); - -- if (!dumper->active) -+ if (!iter->active) - goto out; - - /* Read text or count text lines? 
*/ - if (line) { -- if (!prb_read_valid(prb, dumper->cur_seq, &r)) -+ if (!prb_read_valid(prb, iter->cur_seq, &r)) - goto out; - l = record_print_text(&r, syslog, printk_time); - } else { -- if (!prb_read_valid_info(prb, dumper->cur_seq, -+ if (!prb_read_valid_info(prb, iter->cur_seq, - &info, &line_count)) { - goto out; - } -@@ -3468,7 +3465,7 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog, - - } - -- dumper->cur_seq = r.info->seq + 1; -+ iter->cur_seq = r.info->seq + 1; - ret = true; - out: - if (len) -@@ -3478,7 +3475,7 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog, - - /** - * kmsg_dump_get_line - retrieve one kmsg log line -- * @dumper: registered kmsg dumper -+ * @iter: kmsg dumper iterator - * @syslog: include the "<4>" prefixes - * @line: buffer to copy the line to - * @size: maximum size of the buffer -@@ -3493,14 +3490,14 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper *dumper, bool syslog, - * A return value of FALSE indicates that there are no more records to - * read. - */ --bool kmsg_dump_get_line(struct kmsg_dumper *dumper, bool syslog, -+bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - char *line, size_t size, size_t *len) - { - unsigned long flags; - bool ret; - - logbuf_lock_irqsave(flags); -- ret = kmsg_dump_get_line_nolock(dumper, syslog, line, size, len); -+ ret = kmsg_dump_get_line_nolock(iter, syslog, line, size, len); - logbuf_unlock_irqrestore(flags); - - return ret; -@@ -3509,7 +3506,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_line); - - /** - * kmsg_dump_get_buffer - copy kmsg log lines -- * @dumper: registered kmsg dumper -+ * @iter: kmsg dumper iterator - * @syslog: include the "<4>" prefixes - * @buf: buffer to copy the line to - * @size: maximum size of the buffer -@@ -3526,7 +3523,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_line); - * A return value of FALSE indicates that there are no more records to - * read. 
- */ --bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, -+bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - char *buf, size_t size, size_t *len_out) - { - struct printk_info info; -@@ -3538,19 +3535,19 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - bool ret = false; - bool time = printk_time; - -- if (!dumper->active || !buf || !size) -+ if (!iter->active || !buf || !size) - goto out; - - logbuf_lock_irqsave(flags); -- if (prb_read_valid_info(prb, dumper->cur_seq, &info, NULL)) { -- if (info.seq != dumper->cur_seq) { -+ if (prb_read_valid_info(prb, iter->cur_seq, &info, NULL)) { -+ if (info.seq != iter->cur_seq) { - /* messages are gone, move to first available one */ -- dumper->cur_seq = info.seq; -+ iter->cur_seq = info.seq; - } - } - - /* last entry */ -- if (dumper->cur_seq >= dumper->next_seq) { -+ if (iter->cur_seq >= iter->next_seq) { - logbuf_unlock_irqrestore(flags); - goto out; - } -@@ -3561,7 +3558,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - * because this function (by way of record_print_text()) will - * not write more than size-1 bytes of text into @buf. 
- */ -- seq = find_first_fitting_seq(dumper->cur_seq, dumper->next_seq, -+ seq = find_first_fitting_seq(iter->cur_seq, iter->next_seq, - size - 1, syslog, time); - - /* -@@ -3574,7 +3571,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - - len = 0; - prb_for_each_record(seq, prb, seq, &r) { -- if (r.info->seq >= dumper->next_seq) -+ if (r.info->seq >= iter->next_seq) - break; - - len += record_print_text(&r, syslog, time); -@@ -3583,7 +3580,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog, - prb_rec_init_rd(&r, &info, buf + len, size - len); - } - -- dumper->next_seq = next_seq; -+ iter->next_seq = next_seq; - ret = true; - logbuf_unlock_irqrestore(flags); - out: -@@ -3595,7 +3592,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer); - - /** - * kmsg_dump_rewind_nolock - reset the iterator (unlocked version) -- * @dumper: registered kmsg dumper -+ * @iter: kmsg dumper iterator - * - * Reset the dumper's iterator so that kmsg_dump_get_line() and - * kmsg_dump_get_buffer() can be called again and used multiple -@@ -3603,26 +3600,26 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer); - * - * The function is similar to kmsg_dump_rewind(), but grabs no locks. - */ --void kmsg_dump_rewind_nolock(struct kmsg_dumper *dumper) -+void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter) - { -- dumper->cur_seq = latched_seq_read_nolock(&clear_seq); -- dumper->next_seq = prb_next_seq(prb); -+ iter->cur_seq = latched_seq_read_nolock(&clear_seq); -+ iter->next_seq = prb_next_seq(prb); - } - - /** - * kmsg_dump_rewind - reset the iterator -- * @dumper: registered kmsg dumper -+ * @iter: kmsg dumper iterator - * - * Reset the dumper's iterator so that kmsg_dump_get_line() and - * kmsg_dump_get_buffer() can be called again and used multiple - * times within the same dumper.dump() callback. 
- */ --void kmsg_dump_rewind(struct kmsg_dumper *dumper) -+void kmsg_dump_rewind(struct kmsg_dumper_iter *iter) - { - unsigned long flags; - - logbuf_lock_irqsave(flags); -- kmsg_dump_rewind_nolock(dumper); -+ kmsg_dump_rewind_nolock(iter); - logbuf_unlock_irqrestore(flags); - } - EXPORT_SYMBOL_GPL(kmsg_dump_rewind); --- -2.30.2 - diff --git a/debian/patches-rt/0092-um-synchronize-kmsg_dumper.patch b/debian/patches-rt/0092-um-synchronize-kmsg_dumper.patch deleted file mode 100644 index e7860e328..000000000 --- a/debian/patches-rt/0092-um-synchronize-kmsg_dumper.patch +++ /dev/null @@ -1,61 +0,0 @@ -From 0b0182ece4424edffd8b5ade5cdabecc1ff202cb Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 21 Dec 2020 11:10:03 +0106 -Subject: [PATCH 092/296] um: synchronize kmsg_dumper -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The kmsg_dumper can be called from any context and CPU, possibly -from multiple CPUs simultaneously. Since a static buffer is used -to retrieve the kernel logs, this buffer must be protected against -simultaneous dumping. 
- -Cc: Richard Weinberger <richard@nod.at> -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Reviewed-by: Petr Mladek <pmladek@suse.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/um/kernel/kmsg_dump.c | 8 ++++++++ - 1 file changed, 8 insertions(+) - -diff --git a/arch/um/kernel/kmsg_dump.c b/arch/um/kernel/kmsg_dump.c -index f38349ad00ea..173999422ed8 100644 ---- a/arch/um/kernel/kmsg_dump.c -+++ b/arch/um/kernel/kmsg_dump.c -@@ -1,5 +1,6 @@ - // SPDX-License-Identifier: GPL-2.0 - #include <linux/kmsg_dump.h> -+#include <linux/spinlock.h> - #include <linux/console.h> - #include <shared/init.h> - #include <shared/kern.h> -@@ -9,8 +10,10 @@ static void kmsg_dumper_stdout(struct kmsg_dumper *dumper, - enum kmsg_dump_reason reason, - struct kmsg_dumper_iter *iter) - { -+ static DEFINE_SPINLOCK(lock); - static char line[1024]; - struct console *con; -+ unsigned long flags; - size_t len = 0; - - /* only dump kmsg when no console is available */ -@@ -25,11 +28,16 @@ static void kmsg_dumper_stdout(struct kmsg_dumper *dumper, - if (con) - return; - -+ if (!spin_trylock_irqsave(&lock, flags)) -+ return; -+ - printf("kmsg_dump:\n"); - while (kmsg_dump_get_line(iter, true, line, sizeof(line), &len)) { - line[len] = '\0'; - printf("%s", line); - } -+ -+ spin_unlock_irqrestore(&lock, flags); - } - - static struct kmsg_dumper kmsg_dumper = { --- -2.30.2 - diff --git a/debian/patches-rt/0093-printk-remove-logbuf_lock.patch b/debian/patches-rt/0093-printk-remove-logbuf_lock.patch deleted file mode 100644 index ccc4eb47f..000000000 --- a/debian/patches-rt/0093-printk-remove-logbuf_lock.patch +++ /dev/null @@ -1,486 +0,0 @@ -From 261bf1af359dc306afd111957b682bd7cc74741b Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Tue, 26 Jan 2021 17:43:19 +0106 -Subject: [PATCH 093/296] printk: remove logbuf_lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Since the 
ringbuffer is lockless, there is no need for it to be -protected by @logbuf_lock. Remove @logbuf_lock. - -This means that printk_nmi_direct and printk_safe_flush_on_panic() -no longer need to acquire any lock to run. - -@console_seq, @exclusive_console_stop_seq, @console_dropped are -protected by @console_lock. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/internal.h | 4 +- - kernel/printk/printk.c | 118 ++++++++++++------------------------ - kernel/printk/printk_safe.c | 29 +++------ - 3 files changed, 48 insertions(+), 103 deletions(-) - -diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h -index 3a8fd491758c..e7acc2888c8e 100644 ---- a/kernel/printk/internal.h -+++ b/kernel/printk/internal.h -@@ -12,8 +12,6 @@ - - #define PRINTK_NMI_CONTEXT_OFFSET 0x010000000 - --extern raw_spinlock_t logbuf_lock; -- - __printf(4, 0) - int vprintk_store(int facility, int level, - const struct dev_printk_info *dev_info, -@@ -59,7 +57,7 @@ void defer_console_output(void); - __printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; } - - /* -- * In !PRINTK builds we still export logbuf_lock spin_lock, console_sem -+ * In !PRINTK builds we still export console_sem - * semaphore and some of console functions (console_unlock()/etc.), so - * printk-safe must preserve the existing local IRQ guarantees. - */ -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 6c611328f217..d4e3111e3bc0 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -355,41 +355,6 @@ enum log_flags { - LOG_CONT = 8, /* text is a fragment of a continuation line */ - }; - --/* -- * The logbuf_lock protects kmsg buffer, indices, counters. This can be taken -- * within the scheduler's rq lock. It must be released before calling -- * console_unlock() or anything else that might wake up a process. 
-- */ --DEFINE_RAW_SPINLOCK(logbuf_lock); -- --/* -- * Helper macros to lock/unlock logbuf_lock and switch between -- * printk-safe/unsafe modes. -- */ --#define logbuf_lock_irq() \ -- do { \ -- printk_safe_enter_irq(); \ -- raw_spin_lock(&logbuf_lock); \ -- } while (0) -- --#define logbuf_unlock_irq() \ -- do { \ -- raw_spin_unlock(&logbuf_lock); \ -- printk_safe_exit_irq(); \ -- } while (0) -- --#define logbuf_lock_irqsave(flags) \ -- do { \ -- printk_safe_enter_irqsave(flags); \ -- raw_spin_lock(&logbuf_lock); \ -- } while (0) -- --#define logbuf_unlock_irqrestore(flags) \ -- do { \ -- raw_spin_unlock(&logbuf_lock); \ -- printk_safe_exit_irqrestore(flags); \ -- } while (0) -- - /* syslog_lock protects syslog_* variables and write access to clear_seq. */ - static DEFINE_RAW_SPINLOCK(syslog_lock); - -@@ -401,6 +366,7 @@ static u64 syslog_seq; - static size_t syslog_partial; - static bool syslog_time; - -+/* All 3 protected by @console_sem. */ - /* the next printk record to write to the console */ - static u64 console_seq; - static u64 exclusive_console_stop_seq; -@@ -767,27 +733,27 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, - if (ret) - return ret; - -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - if (!prb_read_valid(prb, atomic64_read(&user->seq), r)) { - if (file->f_flags & O_NONBLOCK) { - ret = -EAGAIN; -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - goto out; - } - -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - ret = wait_event_interruptible(log_wait, - prb_read_valid(prb, atomic64_read(&user->seq), r)); - if (ret) - goto out; -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - } - - if (r->info->seq != atomic64_read(&user->seq)) { - /* our last seen message is gone, return error and reset */ - atomic64_set(&user->seq, r->info->seq); - ret = -EPIPE; -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - goto out; - } - -@@ -797,7 +763,7 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, - 
&r->info->dev_info); - - atomic64_set(&user->seq, r->info->seq + 1); -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - - if (len > count) { - ret = -EINVAL; -@@ -832,7 +798,7 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) - if (offset) - return -ESPIPE; - -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - switch (whence) { - case SEEK_SET: - /* the first record */ -@@ -853,7 +819,7 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) - default: - ret = -EINVAL; - } -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - return ret; - } - -@@ -868,15 +834,15 @@ static __poll_t devkmsg_poll(struct file *file, poll_table *wait) - - poll_wait(file, &log_wait, wait); - -- logbuf_lock_irq(); -- if (prb_read_valid_info(prb, user->seq, &info, NULL)) { -+ printk_safe_enter_irq(); -+ if (prb_read_valid_info(prb, atomic64_read(&user->seq), &info, NULL)) { - /* return error when data has vanished underneath us */ - if (info.seq != atomic64_read(&user->seq)) - ret = EPOLLIN|EPOLLRDNORM|EPOLLERR|EPOLLPRI; - else - ret = EPOLLIN|EPOLLRDNORM; - } -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - - return ret; - } -@@ -909,9 +875,9 @@ static int devkmsg_open(struct inode *inode, struct file *file) - prb_rec_init_rd(&user->record, &user->info, - &user->text_buf[0], sizeof(user->text_buf)); - -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - atomic64_set(&user->seq, prb_first_valid_seq(prb)); -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - - file->private_data = user; - return 0; -@@ -1533,11 +1499,11 @@ static int syslog_print(char __user *buf, int size) - size_t n; - size_t skip; - -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - raw_spin_lock(&syslog_lock); - if (!prb_read_valid(prb, syslog_seq, &r)) { - raw_spin_unlock(&syslog_lock); -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - break; - } - if (r.info->seq != syslog_seq) { -@@ -1567,7 +1533,7 @@ static int syslog_print(char __user *buf, int size) - } 
else - n = 0; - raw_spin_unlock(&syslog_lock); -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - - if (!n) - break; -@@ -1601,7 +1567,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - return -ENOMEM; - - time = printk_time; -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - /* - * Find first record that fits, including all following records, - * into the user-provided buffer for this dump. -@@ -1622,12 +1588,12 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - break; - } - -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - if (copy_to_user(buf + len, text, textlen)) - len = -EFAULT; - else - len += textlen; -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - - if (len < 0) - break; -@@ -1638,7 +1604,7 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - latched_seq_write(&clear_seq, seq); - raw_spin_unlock(&syslog_lock); - } -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - - kfree(text); - return len; -@@ -1646,11 +1612,11 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - - static void syslog_clear(void) - { -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - raw_spin_lock(&syslog_lock); - latched_seq_write(&clear_seq, prb_next_seq(prb)); - raw_spin_unlock(&syslog_lock); -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - } - - /* Return a consistent copy of @syslog_seq. */ -@@ -1738,12 +1704,12 @@ int do_syslog(int type, char __user *buf, int len, int source) - break; - /* Number of chars in the log buffer */ - case SYSLOG_ACTION_SIZE_UNREAD: -- logbuf_lock_irq(); -+ printk_safe_enter_irq(); - raw_spin_lock(&syslog_lock); - if (!prb_read_valid_info(prb, syslog_seq, &info, NULL)) { - /* No unread messages. 
*/ - raw_spin_unlock(&syslog_lock); -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - return 0; - } - if (info.seq != syslog_seq) { -@@ -1772,7 +1738,7 @@ int do_syslog(int type, char __user *buf, int len, int source) - error -= syslog_partial; - } - raw_spin_unlock(&syslog_lock); -- logbuf_unlock_irq(); -+ printk_safe_exit_irq(); - break; - /* Size of the log buffer */ - case SYSLOG_ACTION_SIZE_BUFFER: -@@ -2621,7 +2587,6 @@ void console_unlock(void) - size_t len; - - printk_safe_enter_irqsave(flags); -- raw_spin_lock(&logbuf_lock); - skip: - if (!prb_read_valid(prb, console_seq, &r)) - break; -@@ -2665,7 +2630,6 @@ void console_unlock(void) - console_msg_format & MSG_FORMAT_SYSLOG, - printk_time); - console_seq++; -- raw_spin_unlock(&logbuf_lock); - - /* - * While actively printing out messages, if another printk() -@@ -2692,8 +2656,6 @@ void console_unlock(void) - - console_locked = 0; - -- raw_spin_unlock(&logbuf_lock); -- - up_console_sem(); - - /* -@@ -2702,9 +2664,7 @@ void console_unlock(void) - * there's a new owner and the console_unlock() from them will do the - * flush, no worries. - */ -- raw_spin_lock(&logbuf_lock); - retry = prb_read_valid(prb, console_seq, NULL); -- raw_spin_unlock(&logbuf_lock); - printk_safe_exit_irqrestore(flags); - - if (retry && console_trylock()) -@@ -2771,9 +2731,9 @@ void console_flush_on_panic(enum con_flush_mode mode) - if (mode == CONSOLE_REPLAY_ALL) { - unsigned long flags; - -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - console_seq = prb_first_valid_seq(prb); -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - } - console_unlock(); - } -@@ -3002,7 +2962,7 @@ void register_console(struct console *newcon) - * console_unlock(); will print out the buffered messages - * for us. - */ -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - /* - * We're about to replay the log buffer. 
Only do this to the - * just-registered console to avoid excessive message spam to -@@ -3020,7 +2980,7 @@ void register_console(struct console *newcon) - console_seq = syslog_seq; - raw_spin_unlock(&syslog_lock); - -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - } - console_unlock(); - console_sysfs_notify(); -@@ -3406,10 +3366,10 @@ void kmsg_dump(enum kmsg_dump_reason reason) - - /* initialize iterator with data about the stored records */ - iter.active = true; -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - iter.cur_seq = latched_seq_read_nolock(&clear_seq); - iter.next_seq = prb_next_seq(prb); -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - - /* invoke dumper which will iterate over records */ - dumper->dump(dumper, reason, &iter); -@@ -3496,9 +3456,9 @@ bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - unsigned long flags; - bool ret; - -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - ret = kmsg_dump_get_line_nolock(iter, syslog, line, size, len); -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - - return ret; - } -@@ -3538,7 +3498,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - if (!iter->active || !buf || !size) - goto out; - -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - if (prb_read_valid_info(prb, iter->cur_seq, &info, NULL)) { - if (info.seq != iter->cur_seq) { - /* messages are gone, move to first available one */ -@@ -3548,7 +3508,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - - /* last entry */ - if (iter->cur_seq >= iter->next_seq) { -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - goto out; - } - -@@ -3582,7 +3542,7 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - - iter->next_seq = next_seq; - ret = true; -- logbuf_unlock_irqrestore(flags); -+ 
printk_safe_exit_irqrestore(flags); - out: - if (len_out) - *len_out = len; -@@ -3618,9 +3578,9 @@ void kmsg_dump_rewind(struct kmsg_dumper_iter *iter) - { - unsigned long flags; - -- logbuf_lock_irqsave(flags); -+ printk_safe_enter_irqsave(flags); - kmsg_dump_rewind_nolock(iter); -- logbuf_unlock_irqrestore(flags); -+ printk_safe_exit_irqrestore(flags); - } - EXPORT_SYMBOL_GPL(kmsg_dump_rewind); - -diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c -index 2e9e3ed7d63e..7df8a88d4115 100644 ---- a/kernel/printk/printk_safe.c -+++ b/kernel/printk/printk_safe.c -@@ -16,7 +16,7 @@ - #include "internal.h" - - /* -- * printk() could not take logbuf_lock in NMI context. Instead, -+ * In NMI and safe mode, printk() avoids taking locks. Instead, - * it uses an alternative implementation that temporary stores - * the strings into a per-CPU buffer. The content of the buffer - * is later flushed into the main ring buffer via IRQ work. -@@ -266,18 +266,6 @@ void printk_safe_flush(void) - */ - void printk_safe_flush_on_panic(void) - { -- /* -- * Make sure that we could access the main ring buffer. -- * Do not risk a double release when more CPUs are up. -- */ -- if (raw_spin_is_locked(&logbuf_lock)) { -- if (num_online_cpus() > 1) -- return; -- -- debug_locks_off(); -- raw_spin_lock_init(&logbuf_lock); -- } -- - if (raw_spin_is_locked(&safe_read_lock)) { - if (num_online_cpus() > 1) - return; -@@ -319,9 +307,7 @@ void noinstr printk_nmi_exit(void) - * reordering. - * - * It has effect only when called in NMI context. Then printk() -- * will try to store the messages into the main logbuf directly -- * and use the per-CPU buffers only as a fallback when the lock -- * is not available. -+ * will store the messages into the main logbuf directly. - */ - void printk_nmi_direct_enter(void) - { -@@ -376,20 +362,21 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args) - #endif - - /* -- * Try to use the main logbuf even in NMI. 
But avoid calling console -+ * Use the main logbuf even in NMI. But avoid calling console - * drivers that might have their own locks. - */ -- if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK) && -- raw_spin_trylock(&logbuf_lock)) { -+ if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) { -+ unsigned long flags; - int len; - -+ printk_safe_enter_irqsave(flags); - len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args); -- raw_spin_unlock(&logbuf_lock); -+ printk_safe_exit_irqrestore(flags); - defer_console_output(); - return len; - } - -- /* Use extra buffer in NMI when logbuf_lock is taken or in safe mode. */ -+ /* Use extra buffer in NMI. */ - if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK) - return vprintk_nmi(fmt, args); - --- -2.30.2 - diff --git a/debian/patches-rt/0094-printk-kmsg_dump-remove-_nolock-variants.patch b/debian/patches-rt/0094-printk-kmsg_dump-remove-_nolock-variants.patch deleted file mode 100644 index 66e23326f..000000000 --- a/debian/patches-rt/0094-printk-kmsg_dump-remove-_nolock-variants.patch +++ /dev/null @@ -1,226 +0,0 @@ -From bb2c9a264cfee422a28cfe89e8784e8531ef1bc0 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 21 Dec 2020 10:27:58 +0106 -Subject: [PATCH 094/296] printk: kmsg_dump: remove _nolock() variants -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -kmsg_dump_rewind() and kmsg_dump_get_line() are lockless, so there is -no need for _nolock() variants. Remove these functions and switch all -callers of the _nolock() variants. - -The functions without _nolock() were chosen because they are already -exported to kernel modules. 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> ---- - arch/powerpc/xmon/xmon.c | 4 +-- - include/linux/kmsg_dump.h | 18 +---------- - kernel/debug/kdb/kdb_main.c | 8 ++--- - kernel/printk/printk.c | 60 +++++-------------------------------- - 4 files changed, 15 insertions(+), 75 deletions(-) - -diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c -index b71690e8245b..d62b8e053d4c 100644 ---- a/arch/powerpc/xmon/xmon.c -+++ b/arch/powerpc/xmon/xmon.c -@@ -3017,9 +3017,9 @@ dump_log_buf(void) - catch_memory_errors = 1; - sync(); - -- kmsg_dump_rewind_nolock(&iter); -+ kmsg_dump_rewind(&iter); - xmon_start_pagination(); -- while (kmsg_dump_get_line_nolock(&iter, false, buf, sizeof(buf), &len)) { -+ while (kmsg_dump_get_line(&iter, false, buf, sizeof(buf), &len)) { - buf[len] = '\0'; - printf("%s", buf); - } -diff --git a/include/linux/kmsg_dump.h b/include/linux/kmsg_dump.h -index 2fdb10ab1799..86673930c8ea 100644 ---- a/include/linux/kmsg_dump.h -+++ b/include/linux/kmsg_dump.h -@@ -60,18 +60,13 @@ struct kmsg_dumper { - #ifdef CONFIG_PRINTK - void kmsg_dump(enum kmsg_dump_reason reason); - --bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, bool syslog, -- char *line, size_t size, size_t *len); -- - bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - char *line, size_t size, size_t *len); - - bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - char *buf, size_t size, size_t *len_out); - --void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter); -- --void kmsg_dump_rewind(struct kmsg_dumper_iter *dumper_iter); -+void kmsg_dump_rewind(struct kmsg_dumper_iter *iter); - - int kmsg_dump_register(struct kmsg_dumper *dumper); - -@@ -83,13 +78,6 @@ static inline void kmsg_dump(enum kmsg_dump_reason reason) - { - } - --static inline bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, -- bool syslog, const char *line, -- size_t size, size_t *len) --{ -- return false; --} -- - static inline 
bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - const char *line, size_t size, size_t *len) - { -@@ -102,10 +90,6 @@ static inline bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool sysl - return false; - } - --static inline void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter) --{ --} -- - static inline void kmsg_dump_rewind(struct kmsg_dumper_iter *iter) - { - } -diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c -index 7ae9da245e4b..dbf1d126ac5e 100644 ---- a/kernel/debug/kdb/kdb_main.c -+++ b/kernel/debug/kdb/kdb_main.c -@@ -2126,8 +2126,8 @@ static int kdb_dmesg(int argc, const char **argv) - kdb_set(2, setargs); - } - -- kmsg_dump_rewind_nolock(&iter); -- while (kmsg_dump_get_line_nolock(&iter, 1, NULL, 0, NULL)) -+ kmsg_dump_rewind(&iter); -+ while (kmsg_dump_get_line(&iter, 1, NULL, 0, NULL)) - n++; - - if (lines < 0) { -@@ -2159,8 +2159,8 @@ static int kdb_dmesg(int argc, const char **argv) - if (skip >= n || skip < 0) - return 0; - -- kmsg_dump_rewind_nolock(&iter); -- while (kmsg_dump_get_line_nolock(&iter, 1, buf, sizeof(buf), &len)) { -+ kmsg_dump_rewind(&iter); -+ while (kmsg_dump_get_line(&iter, 1, buf, sizeof(buf), &len)) { - if (skip) { - skip--; - continue; -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index d4e3111e3bc0..d5931e82b73d 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -3378,7 +3378,7 @@ void kmsg_dump(enum kmsg_dump_reason reason) - } - - /** -- * kmsg_dump_get_line_nolock - retrieve one kmsg log line (unlocked version) -+ * kmsg_dump_get_line - retrieve one kmsg log line - * @iter: kmsg dumper iterator - * @syslog: include the "<4>" prefixes - * @line: buffer to copy the line to -@@ -3393,18 +3393,18 @@ void kmsg_dump(enum kmsg_dump_reason reason) - * - * A return value of FALSE indicates that there are no more records to - * read. -- * -- * The function is similar to kmsg_dump_get_line(), but grabs no locks. 
- */ --bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, bool syslog, -- char *line, size_t size, size_t *len) -+bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, -+ char *line, size_t size, size_t *len) - { - struct printk_info info; - unsigned int line_count; - struct printk_record r; -+ unsigned long flags; - size_t l = 0; - bool ret = false; - -+ printk_safe_enter_irqsave(flags); - prb_rec_init_rd(&r, &info, line, size); - - if (!iter->active) -@@ -3428,40 +3428,11 @@ bool kmsg_dump_get_line_nolock(struct kmsg_dumper_iter *iter, bool syslog, - iter->cur_seq = r.info->seq + 1; - ret = true; - out: -+ printk_safe_exit_irqrestore(flags); - if (len) - *len = l; - return ret; - } -- --/** -- * kmsg_dump_get_line - retrieve one kmsg log line -- * @iter: kmsg dumper iterator -- * @syslog: include the "<4>" prefixes -- * @line: buffer to copy the line to -- * @size: maximum size of the buffer -- * @len: length of line placed into buffer -- * -- * Start at the beginning of the kmsg buffer, with the oldest kmsg -- * record, and copy one record into the provided buffer. -- * -- * Consecutive calls will return the next available record moving -- * towards the end of the buffer with the youngest messages. -- * -- * A return value of FALSE indicates that there are no more records to -- * read. 
-- */ --bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, -- char *line, size_t size, size_t *len) --{ -- unsigned long flags; -- bool ret; -- -- printk_safe_enter_irqsave(flags); -- ret = kmsg_dump_get_line_nolock(iter, syslog, line, size, len); -- printk_safe_exit_irqrestore(flags); -- -- return ret; --} - EXPORT_SYMBOL_GPL(kmsg_dump_get_line); - - /** -@@ -3550,22 +3521,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - } - EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer); - --/** -- * kmsg_dump_rewind_nolock - reset the iterator (unlocked version) -- * @iter: kmsg dumper iterator -- * -- * Reset the dumper's iterator so that kmsg_dump_get_line() and -- * kmsg_dump_get_buffer() can be called again and used multiple -- * times within the same dumper.dump() callback. -- * -- * The function is similar to kmsg_dump_rewind(), but grabs no locks. -- */ --void kmsg_dump_rewind_nolock(struct kmsg_dumper_iter *iter) --{ -- iter->cur_seq = latched_seq_read_nolock(&clear_seq); -- iter->next_seq = prb_next_seq(prb); --} -- - /** - * kmsg_dump_rewind - reset the iterator - * @iter: kmsg dumper iterator -@@ -3579,7 +3534,8 @@ void kmsg_dump_rewind(struct kmsg_dumper_iter *iter) - unsigned long flags; - - printk_safe_enter_irqsave(flags); -- kmsg_dump_rewind_nolock(iter); -+ iter->cur_seq = latched_seq_read_nolock(&clear_seq); -+ iter->next_seq = prb_next_seq(prb); - printk_safe_exit_irqrestore(flags); - } - EXPORT_SYMBOL_GPL(kmsg_dump_rewind); --- -2.30.2 - diff --git a/debian/patches-rt/0095-printk-kmsg_dump-use-kmsg_dump_rewind.patch b/debian/patches-rt/0095-printk-kmsg_dump-use-kmsg_dump_rewind.patch deleted file mode 100644 index 9505227cd..000000000 --- a/debian/patches-rt/0095-printk-kmsg_dump-use-kmsg_dump_rewind.patch +++ /dev/null @@ -1,42 +0,0 @@ -From 15d83e932925b7d7e759df9d4819b313f63de2cc Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Wed, 17 Feb 2021 18:23:16 +0100 -Subject: [PATCH 095/296] 
printk: kmsg_dump: use kmsg_dump_rewind -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -kmsg_dump() is open coding the kmsg_dump_rewind(). Call -kmsg_dump_rewind() instead. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 6 +----- - 1 file changed, 1 insertion(+), 5 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index d5931e82b73d..d09eb82785e1 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -3347,7 +3347,6 @@ void kmsg_dump(enum kmsg_dump_reason reason) - { - struct kmsg_dumper_iter iter; - struct kmsg_dumper *dumper; -- unsigned long flags; - - rcu_read_lock(); - list_for_each_entry_rcu(dumper, &dump_list, list) { -@@ -3366,10 +3365,7 @@ void kmsg_dump(enum kmsg_dump_reason reason) - - /* initialize iterator with data about the stored records */ - iter.active = true; -- printk_safe_enter_irqsave(flags); -- iter.cur_seq = latched_seq_read_nolock(&clear_seq); -- iter.next_seq = prb_next_seq(prb); -- printk_safe_exit_irqrestore(flags); -+ kmsg_dump_rewind(&iter); - - /* invoke dumper which will iterate over records */ - dumper->dump(dumper, reason, &iter); --- -2.30.2 - diff --git a/debian/patches-rt/0096-printk-console-remove-unnecessary-safe-buffer-usage.patch b/debian/patches-rt/0096-printk-console-remove-unnecessary-safe-buffer-usage.patch deleted file mode 100644 index 34e5dabdb..000000000 --- a/debian/patches-rt/0096-printk-console-remove-unnecessary-safe-buffer-usage.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 9fca9598dad5e2d887fcd4a88baf518f1881fb0c Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Wed, 17 Feb 2021 18:28:05 +0100 -Subject: [PATCH 096/296] printk: console: remove unnecessary safe buffer usage -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Upon 
registering a console, safe buffers are activated when setting -up the sequence number to replay the log. However, these are already -protected by @console_sem and @syslog_lock. Remove the unnecessary -safe buffer usage. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> ---- - kernel/printk/printk.c | 10 +++------- - 1 file changed, 3 insertions(+), 7 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index d09eb82785e1..f097f78b84e6 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -2961,9 +2961,7 @@ void register_console(struct console *newcon) - /* - * console_unlock(); will print out the buffered messages - * for us. -- */ -- printk_safe_enter_irqsave(flags); -- /* -+ * - * We're about to replay the log buffer. Only do this to the - * just-registered console to avoid excessive message spam to - * the already-registered consoles. -@@ -2976,11 +2974,9 @@ void register_console(struct console *newcon) - exclusive_console_stop_seq = console_seq; - - /* Get a consistent copy of @syslog_seq. */ -- raw_spin_lock(&syslog_lock); -+ raw_spin_lock_irqsave(&syslog_lock, flags); - console_seq = syslog_seq; -- raw_spin_unlock(&syslog_lock); -- -- printk_safe_exit_irqrestore(flags); -+ raw_spin_unlock_irqrestore(&syslog_lock, flags); - } - console_unlock(); - console_sysfs_notify(); --- -2.30.2 - diff --git a/debian/patches-rt/0097-printk-track-limit-recursion.patch b/debian/patches-rt/0097-printk-track-limit-recursion.patch deleted file mode 100644 index eeb56404b..000000000 --- a/debian/patches-rt/0097-printk-track-limit-recursion.patch +++ /dev/null @@ -1,143 +0,0 @@ -From 87c43f41a3330539e982f9e874704b04765ab4aa Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Fri, 11 Dec 2020 00:55:25 +0106 -Subject: [PATCH 097/296] printk: track/limit recursion -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Limit printk() recursion to 1 level. 
This is enough to print a -stacktrace for the printk call, should a WARN or BUG occur. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 74 ++++++++++++++++++++++++++++++++++++++++-- - 1 file changed, 71 insertions(+), 3 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index f097f78b84e6..26b59e8fd2a0 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -1941,6 +1941,65 @@ static void call_console_drivers(const char *ext_text, size_t ext_len, - } - } - -+#ifdef CONFIG_PRINTK_NMI -+#define NUM_RECURSION_CTX 2 -+#else -+#define NUM_RECURSION_CTX 1 -+#endif -+ -+struct printk_recursion { -+ char count[NUM_RECURSION_CTX]; -+}; -+ -+static DEFINE_PER_CPU(struct printk_recursion, percpu_printk_recursion); -+static char printk_recursion_count[NUM_RECURSION_CTX]; -+ -+static char *printk_recursion_counter(void) -+{ -+ struct printk_recursion *rec; -+ char *count; -+ -+ if (!printk_percpu_data_ready()) { -+ count = &printk_recursion_count[0]; -+ } else { -+ rec = this_cpu_ptr(&percpu_printk_recursion); -+ -+ count = &rec->count[0]; -+ } -+ -+#ifdef CONFIG_PRINTK_NMI -+ if (in_nmi()) -+ count++; -+#endif -+ -+ return count; -+} -+ -+static bool printk_enter_irqsave(unsigned long *flags) -+{ -+ char *count; -+ -+ local_irq_save(*flags); -+ count = printk_recursion_counter(); -+ /* Only 1 level of recursion allowed. 
*/ -+ if (*count > 1) { -+ local_irq_restore(*flags); -+ return false; -+ } -+ (*count)++; -+ -+ return true; -+} -+ -+static void printk_exit_irqrestore(unsigned long flags) -+{ -+ char *count; -+ -+ count = printk_recursion_counter(); -+ (*count)--; -+ local_irq_restore(flags); -+} -+ - int printk_delay_msec __read_mostly; - - static inline void printk_delay(void) -@@ -2041,11 +2100,13 @@ int vprintk_store(int facility, int level, - struct prb_reserved_entry e; - enum log_flags lflags = 0; - struct printk_record r; -+ unsigned long irqflags; - u16 trunc_msg_len = 0; - char prefix_buf[8]; - u16 reserve_size; - va_list args2; - u16 text_len; -+ int ret = 0; - u64 ts_nsec; - - /* -@@ -2056,6 +2117,9 @@ int vprintk_store(int facility, int level, - */ - ts_nsec = local_clock(); - -+ if (!printk_enter_irqsave(&irqflags)) -+ return 0; -+ - /* - * The sprintf needs to come first since the syslog prefix might be - * passed in as a parameter. An extra byte must be reserved so that -@@ -2093,7 +2157,8 @@ int vprintk_store(int facility, int level, - prb_commit(&e); - } - -- return text_len; -+ ret = text_len; -+ goto out; - } - } - -@@ -2109,7 +2174,7 @@ int vprintk_store(int facility, int level, - - prb_rec_init_wr(&r, reserve_size + trunc_msg_len); - if (!prb_reserve(&e, prb, &r)) -- return 0; -+ goto out; - } - - /* fill message */ -@@ -2131,7 +2196,10 @@ int vprintk_store(int facility, int level, - else - prb_final_commit(&e); - -- return (text_len + trunc_msg_len); -+ ret = text_len + trunc_msg_len; -+out: -+ printk_exit_irqrestore(irqflags); -+ return ret; - } - - asmlinkage int vprintk_emit(int facility, int level, --- -2.30.2 - diff --git a/debian/patches-rt/0098-printk-remove-safe-buffers.patch b/debian/patches-rt/0098-printk-remove-safe-buffers.patch deleted file mode 100644 index 3c681dc48..000000000 --- a/debian/patches-rt/0098-printk-remove-safe-buffers.patch +++ /dev/null @@ -1,877 +0,0 @@ -From d1e22580f18e64a05d1a9255f5601c4fb005e21f Mon Sep 17 00:00:00 2001 
-From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:00 +0106 -Subject: [PATCH 098/296] printk: remove safe buffers -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -With @logbuf_lock removed, the high level printk functions for -storing messages are lockless. Messages can be stored from any -context, so there is no need for the NMI and safe buffers anymore. - -Remove the NMI and safe buffers. In NMI or safe contexts, store -the message immediately but still use irq_work to defer the console -printing. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/powerpc/kernel/traps.c | 1 - - arch/powerpc/kernel/watchdog.c | 5 - - include/linux/printk.h | 10 - - kernel/kexec_core.c | 1 - - kernel/panic.c | 3 - - kernel/printk/internal.h | 2 - - kernel/printk/printk.c | 85 ++------- - kernel/printk/printk_safe.c | 329 +-------------------------------- - lib/nmi_backtrace.c | 6 - - 9 files changed, 17 insertions(+), 425 deletions(-) - -diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c -index 77dffea3d537..fc82a7670d37 100644 ---- a/arch/powerpc/kernel/traps.c -+++ b/arch/powerpc/kernel/traps.c -@@ -170,7 +170,6 @@ extern void panic_flush_kmsg_start(void) - - extern void panic_flush_kmsg_end(void) - { -- printk_safe_flush_on_panic(); - kmsg_dump(KMSG_DUMP_PANIC); - bust_spinlocks(0); - debug_locks_off(); -diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c -index af3c15a1d41e..8ae46c5945d0 100644 ---- a/arch/powerpc/kernel/watchdog.c -+++ b/arch/powerpc/kernel/watchdog.c -@@ -181,11 +181,6 @@ static void watchdog_smp_panic(int cpu, u64 tb) - - wd_smp_unlock(&flags); - -- printk_safe_flush(); -- /* -- * printk_safe_flush() seems to require another print -- * before anything actually goes out to console. 
-- */ - if (sysctl_hardlockup_all_cpu_backtrace) - trigger_allbutself_cpu_backtrace(); - -diff --git a/include/linux/printk.h b/include/linux/printk.h -index fe7eb2351610..2476796c1150 100644 ---- a/include/linux/printk.h -+++ b/include/linux/printk.h -@@ -207,8 +207,6 @@ __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...); - void dump_stack_print_info(const char *log_lvl); - void show_regs_print_info(const char *log_lvl); - extern asmlinkage void dump_stack(void) __cold; --extern void printk_safe_flush(void); --extern void printk_safe_flush_on_panic(void); - #else - static inline __printf(1, 0) - int vprintk(const char *s, va_list args) -@@ -272,14 +270,6 @@ static inline void show_regs_print_info(const char *log_lvl) - static inline void dump_stack(void) - { - } -- --static inline void printk_safe_flush(void) --{ --} -- --static inline void printk_safe_flush_on_panic(void) --{ --} - #endif - - extern int kptr_restrict; -diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c -index c589c7a9562c..fb0ca1a805b5 100644 ---- a/kernel/kexec_core.c -+++ b/kernel/kexec_core.c -@@ -978,7 +978,6 @@ void crash_kexec(struct pt_regs *regs) - old_cpu = atomic_cmpxchg(&panic_cpu, PANIC_CPU_INVALID, this_cpu); - if (old_cpu == PANIC_CPU_INVALID) { - /* This is the 1st CPU which comes here, so go ahead. */ -- printk_safe_flush_on_panic(); - __crash_kexec(regs); - - /* -diff --git a/kernel/panic.c b/kernel/panic.c -index 332736a72a58..1f0df42f8d0c 100644 ---- a/kernel/panic.c -+++ b/kernel/panic.c -@@ -247,7 +247,6 @@ void panic(const char *fmt, ...) - * Bypass the panic_cpu check and call __crash_kexec directly. - */ - if (!_crash_kexec_post_notifiers) { -- printk_safe_flush_on_panic(); - __crash_kexec(NULL); - - /* -@@ -271,8 +270,6 @@ void panic(const char *fmt, ...) - */ - atomic_notifier_call_chain(&panic_notifier_list, 0, buf); - -- /* Call flush even twice. 
It tries harder with a single online CPU */ -- printk_safe_flush_on_panic(); - kmsg_dump(KMSG_DUMP_PANIC); - - /* -diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h -index e7acc2888c8e..e108b2ece8c7 100644 ---- a/kernel/printk/internal.h -+++ b/kernel/printk/internal.h -@@ -23,7 +23,6 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args); - void __printk_safe_enter(void); - void __printk_safe_exit(void); - --void printk_safe_init(void); - bool printk_percpu_data_ready(void); - - #define printk_safe_enter_irqsave(flags) \ -@@ -67,6 +66,5 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; } - #define printk_safe_enter_irq() local_irq_disable() - #define printk_safe_exit_irq() local_irq_enable() - --static inline void printk_safe_init(void) { } - static inline bool printk_percpu_data_ready(void) { return false; } - #endif /* CONFIG_PRINTK */ -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 26b59e8fd2a0..9d487cc4df6e 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -733,27 +733,22 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, - if (ret) - return ret; - -- printk_safe_enter_irq(); - if (!prb_read_valid(prb, atomic64_read(&user->seq), r)) { - if (file->f_flags & O_NONBLOCK) { - ret = -EAGAIN; -- printk_safe_exit_irq(); - goto out; - } - -- printk_safe_exit_irq(); - ret = wait_event_interruptible(log_wait, - prb_read_valid(prb, atomic64_read(&user->seq), r)); - if (ret) - goto out; -- printk_safe_enter_irq(); - } - - if (r->info->seq != atomic64_read(&user->seq)) { - /* our last seen message is gone, return error and reset */ - atomic64_set(&user->seq, r->info->seq); - ret = -EPIPE; -- printk_safe_exit_irq(); - goto out; - } - -@@ -763,7 +758,6 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf, - &r->info->dev_info); - - atomic64_set(&user->seq, r->info->seq + 1); -- printk_safe_exit_irq(); - - if (len > count) { - ret = -EINVAL; -@@ 
-798,7 +792,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) - if (offset) - return -ESPIPE; - -- printk_safe_enter_irq(); - switch (whence) { - case SEEK_SET: - /* the first record */ -@@ -819,7 +812,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence) - default: - ret = -EINVAL; - } -- printk_safe_exit_irq(); - return ret; - } - -@@ -834,7 +826,6 @@ static __poll_t devkmsg_poll(struct file *file, poll_table *wait) - - poll_wait(file, &log_wait, wait); - -- printk_safe_enter_irq(); - if (prb_read_valid_info(prb, atomic64_read(&user->seq), &info, NULL)) { - /* return error when data has vanished underneath us */ - if (info.seq != atomic64_read(&user->seq)) -@@ -842,7 +833,6 @@ static __poll_t devkmsg_poll(struct file *file, poll_table *wait) - else - ret = EPOLLIN|EPOLLRDNORM; - } -- printk_safe_exit_irq(); - - return ret; - } -@@ -875,9 +865,7 @@ static int devkmsg_open(struct inode *inode, struct file *file) - prb_rec_init_rd(&user->record, &user->info, - &user->text_buf[0], sizeof(user->text_buf)); - -- printk_safe_enter_irq(); - atomic64_set(&user->seq, prb_first_valid_seq(prb)); -- printk_safe_exit_irq(); - - file->private_data = user; - return 0; -@@ -1043,9 +1031,6 @@ static inline void log_buf_add_cpu(void) {} - - static void __init set_percpu_data_ready(void) - { -- printk_safe_init(); -- /* Make sure we set this flag only after printk_safe() init is done */ -- barrier(); - __printk_percpu_data_ready = true; - } - -@@ -1143,8 +1128,6 @@ void __init setup_log_buf(int early) - new_descs, ilog2(new_descs_count), - new_infos); - -- printk_safe_enter_irqsave(flags); -- - log_buf_len = new_log_buf_len; - log_buf = new_log_buf; - new_log_buf_len = 0; -@@ -1160,8 +1143,6 @@ void __init setup_log_buf(int early) - */ - prb = &printk_rb_dynamic; - -- printk_safe_exit_irqrestore(flags); -- - if (seq != prb_next_seq(&printk_rb_static)) { - pr_err("dropped %llu messages\n", - prb_next_seq(&printk_rb_static) - 
seq); -@@ -1499,11 +1480,9 @@ static int syslog_print(char __user *buf, int size) - size_t n; - size_t skip; - -- printk_safe_enter_irq(); -- raw_spin_lock(&syslog_lock); -+ raw_spin_lock_irq(&syslog_lock); - if (!prb_read_valid(prb, syslog_seq, &r)) { -- raw_spin_unlock(&syslog_lock); -- printk_safe_exit_irq(); -+ raw_spin_unlock_irq(&syslog_lock); - break; - } - if (r.info->seq != syslog_seq) { -@@ -1532,8 +1511,7 @@ static int syslog_print(char __user *buf, int size) - syslog_partial += n; - } else - n = 0; -- raw_spin_unlock(&syslog_lock); -- printk_safe_exit_irq(); -+ raw_spin_unlock_irq(&syslog_lock); - - if (!n) - break; -@@ -1567,7 +1545,6 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - return -ENOMEM; - - time = printk_time; -- printk_safe_enter_irq(); - /* - * Find first record that fits, including all following records, - * into the user-provided buffer for this dump. -@@ -1588,23 +1565,20 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - break; - } - -- printk_safe_exit_irq(); - if (copy_to_user(buf + len, text, textlen)) - len = -EFAULT; - else - len += textlen; -- printk_safe_enter_irq(); - - if (len < 0) - break; - } - - if (clear) { -- raw_spin_lock(&syslog_lock); -+ raw_spin_lock_irq(&syslog_lock); - latched_seq_write(&clear_seq, seq); -- raw_spin_unlock(&syslog_lock); -+ raw_spin_unlock_irq(&syslog_lock); - } -- printk_safe_exit_irq(); - - kfree(text); - return len; -@@ -1612,11 +1586,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - - static void syslog_clear(void) - { -- printk_safe_enter_irq(); -- raw_spin_lock(&syslog_lock); -+ raw_spin_lock_irq(&syslog_lock); - latched_seq_write(&clear_seq, prb_next_seq(prb)); -- raw_spin_unlock(&syslog_lock); -- printk_safe_exit_irq(); -+ raw_spin_unlock_irq(&syslog_lock); - } - - /* Return a consistent copy of @syslog_seq. 
*/ -@@ -1704,12 +1676,10 @@ int do_syslog(int type, char __user *buf, int len, int source) - break; - /* Number of chars in the log buffer */ - case SYSLOG_ACTION_SIZE_UNREAD: -- printk_safe_enter_irq(); -- raw_spin_lock(&syslog_lock); -+ raw_spin_lock_irq(&syslog_lock); - if (!prb_read_valid_info(prb, syslog_seq, &info, NULL)) { - /* No unread messages. */ -- raw_spin_unlock(&syslog_lock); -- printk_safe_exit_irq(); -+ raw_spin_unlock_irq(&syslog_lock); - return 0; - } - if (info.seq != syslog_seq) { -@@ -1737,8 +1707,7 @@ int do_syslog(int type, char __user *buf, int len, int source) - } - error -= syslog_partial; - } -- raw_spin_unlock(&syslog_lock); -- printk_safe_exit_irq(); -+ raw_spin_unlock_irq(&syslog_lock); - break; - /* Size of the log buffer */ - case SYSLOG_ACTION_SIZE_BUFFER: -@@ -2208,7 +2177,6 @@ asmlinkage int vprintk_emit(int facility, int level, - { - int printed_len; - bool in_sched = false; -- unsigned long flags; - - /* Suppress unimportant messages after panic happens */ - if (unlikely(suppress_printk)) -@@ -2222,9 +2190,7 @@ asmlinkage int vprintk_emit(int facility, int level, - boot_delay_msec(level); - printk_delay(); - -- printk_safe_enter_irqsave(flags); - printed_len = vprintk_store(facility, level, dev_info, fmt, args); -- printk_safe_exit_irqrestore(flags); - - /* If called from the scheduler, we can not call up(). 
*/ - if (!in_sched) { -@@ -2609,7 +2575,6 @@ void console_unlock(void) - { - static char ext_text[CONSOLE_EXT_LOG_MAX]; - static char text[CONSOLE_LOG_MAX]; -- unsigned long flags; - bool do_cond_resched, retry; - struct printk_info info; - struct printk_record r; -@@ -2654,7 +2619,6 @@ void console_unlock(void) - size_t ext_len = 0; - size_t len; - -- printk_safe_enter_irqsave(flags); - skip: - if (!prb_read_valid(prb, console_seq, &r)) - break; -@@ -2711,12 +2675,8 @@ void console_unlock(void) - call_console_drivers(ext_text, ext_len, text, len); - start_critical_timings(); - -- if (console_lock_spinning_disable_and_check()) { -- printk_safe_exit_irqrestore(flags); -+ if (console_lock_spinning_disable_and_check()) - return; -- } -- -- printk_safe_exit_irqrestore(flags); - - if (do_cond_resched) - cond_resched(); -@@ -2733,8 +2693,6 @@ void console_unlock(void) - * flush, no worries. - */ - retry = prb_read_valid(prb, console_seq, NULL); -- printk_safe_exit_irqrestore(flags); -- - if (retry && console_trylock()) - goto again; - } -@@ -2796,13 +2754,8 @@ void console_flush_on_panic(enum con_flush_mode mode) - console_trylock(); - console_may_schedule = 0; - -- if (mode == CONSOLE_REPLAY_ALL) { -- unsigned long flags; -- -- printk_safe_enter_irqsave(flags); -+ if (mode == CONSOLE_REPLAY_ALL) - console_seq = prb_first_valid_seq(prb); -- printk_safe_exit_irqrestore(flags); -- } - console_unlock(); - } - -@@ -3460,11 +3413,9 @@ bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - struct printk_info info; - unsigned int line_count; - struct printk_record r; -- unsigned long flags; - size_t l = 0; - bool ret = false; - -- printk_safe_enter_irqsave(flags); - prb_rec_init_rd(&r, &info, line, size); - - if (!iter->active) -@@ -3488,7 +3439,6 @@ bool kmsg_dump_get_line(struct kmsg_dumper_iter *iter, bool syslog, - iter->cur_seq = r.info->seq + 1; - ret = true; - out: -- printk_safe_exit_irqrestore(flags); - if (len) - *len = l; - return ret; -@@ -3519,7 
+3469,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - { - struct printk_info info; - struct printk_record r; -- unsigned long flags; - u64 seq; - u64 next_seq; - size_t len = 0; -@@ -3529,7 +3478,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - if (!iter->active || !buf || !size) - goto out; - -- printk_safe_enter_irqsave(flags); - if (prb_read_valid_info(prb, iter->cur_seq, &info, NULL)) { - if (info.seq != iter->cur_seq) { - /* messages are gone, move to first available one */ -@@ -3538,10 +3486,8 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - } - - /* last entry */ -- if (iter->cur_seq >= iter->next_seq) { -- printk_safe_exit_irqrestore(flags); -+ if (iter->cur_seq >= iter->next_seq) - goto out; -- } - - /* - * Find first record that fits, including all following records, -@@ -3573,7 +3519,6 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper_iter *iter, bool syslog, - - iter->next_seq = next_seq; - ret = true; -- printk_safe_exit_irqrestore(flags); - out: - if (len_out) - *len_out = len; -@@ -3591,12 +3536,8 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer); - */ - void kmsg_dump_rewind(struct kmsg_dumper_iter *iter) - { -- unsigned long flags; -- -- printk_safe_enter_irqsave(flags); - iter->cur_seq = latched_seq_read_nolock(&clear_seq); - iter->next_seq = prb_next_seq(prb); -- printk_safe_exit_irqrestore(flags); - } - EXPORT_SYMBOL_GPL(kmsg_dump_rewind); - -diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c -index 7df8a88d4115..c23b127a6545 100644 ---- a/kernel/printk/printk_safe.c -+++ b/kernel/printk/printk_safe.c -@@ -15,282 +15,9 @@ - - #include "internal.h" - --/* -- * In NMI and safe mode, printk() avoids taking locks. Instead, -- * it uses an alternative implementation that temporary stores -- * the strings into a per-CPU buffer. The content of the buffer -- * is later flushed into the main ring buffer via IRQ work. 
-- * -- * The alternative implementation is chosen transparently -- * by examining current printk() context mask stored in @printk_context -- * per-CPU variable. -- * -- * The implementation allows to flush the strings also from another CPU. -- * There are situations when we want to make sure that all buffers -- * were handled or when IRQs are blocked. -- */ -- --#define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) - \ -- sizeof(atomic_t) - \ -- sizeof(atomic_t) - \ -- sizeof(struct irq_work)) -- --struct printk_safe_seq_buf { -- atomic_t len; /* length of written data */ -- atomic_t message_lost; -- struct irq_work work; /* IRQ work that flushes the buffer */ -- unsigned char buffer[SAFE_LOG_BUF_LEN]; --}; -- --static DEFINE_PER_CPU(struct printk_safe_seq_buf, safe_print_seq); - static DEFINE_PER_CPU(int, printk_context); - --static DEFINE_RAW_SPINLOCK(safe_read_lock); -- --#ifdef CONFIG_PRINTK_NMI --static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq); --#endif -- --/* Get flushed in a more safe context. */ --static void queue_flush_work(struct printk_safe_seq_buf *s) --{ -- if (printk_percpu_data_ready()) -- irq_work_queue(&s->work); --} -- --/* -- * Add a message to per-CPU context-dependent buffer. NMI and printk-safe -- * have dedicated buffers, because otherwise printk-safe preempted by -- * NMI-printk would have overwritten the NMI messages. -- * -- * The messages are flushed from irq work (or from panic()), possibly, -- * from other CPU, concurrently with printk_safe_log_store(). Should this -- * happen, printk_safe_log_store() will notice the buffer->len mismatch -- * and repeat the write. -- */ --static __printf(2, 0) int printk_safe_log_store(struct printk_safe_seq_buf *s, -- const char *fmt, va_list args) --{ -- int add; -- size_t len; -- va_list ap; -- --again: -- len = atomic_read(&s->len); -- -- /* The trailing '\0' is not counted into len. 
*/ -- if (len >= sizeof(s->buffer) - 1) { -- atomic_inc(&s->message_lost); -- queue_flush_work(s); -- return 0; -- } -- -- /* -- * Make sure that all old data have been read before the buffer -- * was reset. This is not needed when we just append data. -- */ -- if (!len) -- smp_rmb(); -- -- va_copy(ap, args); -- add = vscnprintf(s->buffer + len, sizeof(s->buffer) - len, fmt, ap); -- va_end(ap); -- if (!add) -- return 0; -- -- /* -- * Do it once again if the buffer has been flushed in the meantime. -- * Note that atomic_cmpxchg() is an implicit memory barrier that -- * makes sure that the data were written before updating s->len. -- */ -- if (atomic_cmpxchg(&s->len, len, len + add) != len) -- goto again; -- -- queue_flush_work(s); -- return add; --} -- --static inline void printk_safe_flush_line(const char *text, int len) --{ -- /* -- * Avoid any console drivers calls from here, because we may be -- * in NMI or printk_safe context (when in panic). The messages -- * must go only into the ring buffer at this stage. Consoles will -- * get explicitly called later when a crashdump is not generated. -- */ -- printk_deferred("%.*s", len, text); --} -- --/* printk part of the temporary buffer line by line */ --static int printk_safe_flush_buffer(const char *start, size_t len) --{ -- const char *c, *end; -- bool header; -- -- c = start; -- end = start + len; -- header = true; -- -- /* Print line by line. */ -- while (c < end) { -- if (*c == '\n') { -- printk_safe_flush_line(start, c - start + 1); -- start = ++c; -- header = true; -- continue; -- } -- -- /* Handle continuous lines or missing new line. */ -- if ((c + 1 < end) && printk_get_level(c)) { -- if (header) { -- c = printk_skip_level(c); -- continue; -- } -- -- printk_safe_flush_line(start, c - start); -- start = c++; -- header = true; -- continue; -- } -- -- header = false; -- c++; -- } -- -- /* Check if there was a partial line. Ignore pure header. 
*/ -- if (start < end && !header) { -- static const char newline[] = KERN_CONT "\n"; -- -- printk_safe_flush_line(start, end - start); -- printk_safe_flush_line(newline, strlen(newline)); -- } -- -- return len; --} -- --static void report_message_lost(struct printk_safe_seq_buf *s) --{ -- int lost = atomic_xchg(&s->message_lost, 0); -- -- if (lost) -- printk_deferred("Lost %d message(s)!\n", lost); --} -- --/* -- * Flush data from the associated per-CPU buffer. The function -- * can be called either via IRQ work or independently. -- */ --static void __printk_safe_flush(struct irq_work *work) --{ -- struct printk_safe_seq_buf *s = -- container_of(work, struct printk_safe_seq_buf, work); -- unsigned long flags; -- size_t len; -- int i; -- -- /* -- * The lock has two functions. First, one reader has to flush all -- * available message to make the lockless synchronization with -- * writers easier. Second, we do not want to mix messages from -- * different CPUs. This is especially important when printing -- * a backtrace. -- */ -- raw_spin_lock_irqsave(&safe_read_lock, flags); -- -- i = 0; --more: -- len = atomic_read(&s->len); -- -- /* -- * This is just a paranoid check that nobody has manipulated -- * the buffer an unexpected way. If we printed something then -- * @len must only increase. Also it should never overflow the -- * buffer size. -- */ -- if ((i && i >= len) || len > sizeof(s->buffer)) { -- const char *msg = "printk_safe_flush: internal error\n"; -- -- printk_safe_flush_line(msg, strlen(msg)); -- len = 0; -- } -- -- if (!len) -- goto out; /* Someone else has already flushed the buffer. */ -- -- /* Make sure that data has been written up to the @len */ -- smp_rmb(); -- i += printk_safe_flush_buffer(s->buffer + i, len - i); -- -- /* -- * Check that nothing has got added in the meantime and truncate -- * the buffer. Note that atomic_cmpxchg() is an implicit memory -- * barrier that makes sure that the data were copied before -- * updating s->len. 
-- */ -- if (atomic_cmpxchg(&s->len, len, 0) != len) -- goto more; -- --out: -- report_message_lost(s); -- raw_spin_unlock_irqrestore(&safe_read_lock, flags); --} -- --/** -- * printk_safe_flush - flush all per-cpu nmi buffers. -- * -- * The buffers are flushed automatically via IRQ work. This function -- * is useful only when someone wants to be sure that all buffers have -- * been flushed at some point. -- */ --void printk_safe_flush(void) --{ -- int cpu; -- -- for_each_possible_cpu(cpu) { --#ifdef CONFIG_PRINTK_NMI -- __printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work); --#endif -- __printk_safe_flush(&per_cpu(safe_print_seq, cpu).work); -- } --} -- --/** -- * printk_safe_flush_on_panic - flush all per-cpu nmi buffers when the system -- * goes down. -- * -- * Similar to printk_safe_flush() but it can be called even in NMI context when -- * the system goes down. It does the best effort to get NMI messages into -- * the main ring buffer. -- * -- * Note that it could try harder when there is only one CPU online. -- */ --void printk_safe_flush_on_panic(void) --{ -- if (raw_spin_is_locked(&safe_read_lock)) { -- if (num_online_cpus() > 1) -- return; -- -- debug_locks_off(); -- raw_spin_lock_init(&safe_read_lock); -- } -- -- printk_safe_flush(); --} -- - #ifdef CONFIG_PRINTK_NMI --/* -- * Safe printk() for NMI context. It uses a per-CPU buffer to -- * store the message. NMIs are not nested, so there is always only -- * one writer running. But the buffer might get flushed from another -- * CPU, so we need to be careful. 
-- */ --static __printf(1, 0) int vprintk_nmi(const char *fmt, va_list args) --{ -- struct printk_safe_seq_buf *s = this_cpu_ptr(&nmi_print_seq); -- -- return printk_safe_log_store(s, fmt, args); --} -- - void noinstr printk_nmi_enter(void) - { - this_cpu_add(printk_context, PRINTK_NMI_CONTEXT_OFFSET); -@@ -305,9 +32,6 @@ void noinstr printk_nmi_exit(void) - * Marks a code that might produce many messages in NMI context - * and the risk of losing them is more critical than eventual - * reordering. -- * -- * It has effect only when called in NMI context. Then printk() -- * will store the messages into the main logbuf directly. - */ - void printk_nmi_direct_enter(void) - { -@@ -320,27 +44,8 @@ void printk_nmi_direct_exit(void) - this_cpu_and(printk_context, ~PRINTK_NMI_DIRECT_CONTEXT_MASK); - } - --#else -- --static __printf(1, 0) int vprintk_nmi(const char *fmt, va_list args) --{ -- return 0; --} -- - #endif /* CONFIG_PRINTK_NMI */ - --/* -- * Lock-less printk(), to avoid deadlocks should the printk() recurse -- * into itself. It uses a per-CPU buffer to store the message, just like -- * NMI. -- */ --static __printf(1, 0) int vprintk_safe(const char *fmt, va_list args) --{ -- struct printk_safe_seq_buf *s = this_cpu_ptr(&safe_print_seq); -- -- return printk_safe_log_store(s, fmt, args); --} -- - /* Can be preempted by NMI. */ - void __printk_safe_enter(void) - { -@@ -365,8 +70,10 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args) - * Use the main logbuf even in NMI. But avoid calling console - * drivers that might have their own locks. 
- */ -- if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK)) { -- unsigned long flags; -+ if (this_cpu_read(printk_context) & -+ (PRINTK_NMI_DIRECT_CONTEXT_MASK | -+ PRINTK_NMI_CONTEXT_MASK | -+ PRINTK_SAFE_CONTEXT_MASK)) { - int len; - - printk_safe_enter_irqsave(flags); -@@ -376,34 +83,6 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args) - return len; - } - -- /* Use extra buffer in NMI. */ -- if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK) -- return vprintk_nmi(fmt, args); -- -- /* Use extra buffer to prevent a recursion deadlock in safe mode. */ -- if (this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) -- return vprintk_safe(fmt, args); -- - /* No obstacles. */ - return vprintk_default(fmt, args); - } -- --void __init printk_safe_init(void) --{ -- int cpu; -- -- for_each_possible_cpu(cpu) { -- struct printk_safe_seq_buf *s; -- -- s = &per_cpu(safe_print_seq, cpu); -- init_irq_work(&s->work, __printk_safe_flush); -- --#ifdef CONFIG_PRINTK_NMI -- s = &per_cpu(nmi_print_seq, cpu); -- init_irq_work(&s->work, __printk_safe_flush); --#endif -- } -- -- /* Flush pending messages that did not have scheduled IRQ works. */ -- printk_safe_flush(); --} -diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c -index 8abe1870dba4..b09a490f5f70 100644 ---- a/lib/nmi_backtrace.c -+++ b/lib/nmi_backtrace.c -@@ -75,12 +75,6 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask, - touch_softlockup_watchdog(); - } - -- /* -- * Force flush any remote buffers that might be stuck in IRQ context -- * and therefore could not run their irq_work. 
-- */ -- printk_safe_flush(); -- - clear_bit_unlock(0, &backtrace_flag); - put_cpu(); - } --- -2.30.2 - diff --git a/debian/patches-rt/0099-printk-convert-syslog_lock-to-spin_lock.patch b/debian/patches-rt/0099-printk-convert-syslog_lock-to-spin_lock.patch deleted file mode 100644 index 82034933d..000000000 --- a/debian/patches-rt/0099-printk-convert-syslog_lock-to-spin_lock.patch +++ /dev/null @@ -1,119 +0,0 @@ -From 95476dc899f400d851f5eed90649a69c6634fba1 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Thu, 18 Feb 2021 17:37:41 +0100 -Subject: [PATCH 099/296] printk: convert @syslog_lock to spin_lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 30 +++++++++++++++--------------- - 1 file changed, 15 insertions(+), 15 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 9d487cc4df6e..067aa589b4e8 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -356,7 +356,7 @@ enum log_flags { - }; - - /* syslog_lock protects syslog_* variables and write access to clear_seq. 
*/ --static DEFINE_RAW_SPINLOCK(syslog_lock); -+static DEFINE_SPINLOCK(syslog_lock); - - #ifdef CONFIG_PRINTK - DECLARE_WAIT_QUEUE_HEAD(log_wait); -@@ -1480,9 +1480,9 @@ static int syslog_print(char __user *buf, int size) - size_t n; - size_t skip; - -- raw_spin_lock_irq(&syslog_lock); -+ spin_lock_irq(&syslog_lock); - if (!prb_read_valid(prb, syslog_seq, &r)) { -- raw_spin_unlock_irq(&syslog_lock); -+ spin_unlock_irq(&syslog_lock); - break; - } - if (r.info->seq != syslog_seq) { -@@ -1511,7 +1511,7 @@ static int syslog_print(char __user *buf, int size) - syslog_partial += n; - } else - n = 0; -- raw_spin_unlock_irq(&syslog_lock); -+ spin_unlock_irq(&syslog_lock); - - if (!n) - break; -@@ -1575,9 +1575,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - } - - if (clear) { -- raw_spin_lock_irq(&syslog_lock); -+ spin_lock_irq(&syslog_lock); - latched_seq_write(&clear_seq, seq); -- raw_spin_unlock_irq(&syslog_lock); -+ spin_unlock_irq(&syslog_lock); - } - - kfree(text); -@@ -1586,9 +1586,9 @@ static int syslog_print_all(char __user *buf, int size, bool clear) - - static void syslog_clear(void) - { -- raw_spin_lock_irq(&syslog_lock); -+ spin_lock_irq(&syslog_lock); - latched_seq_write(&clear_seq, prb_next_seq(prb)); -- raw_spin_unlock_irq(&syslog_lock); -+ spin_unlock_irq(&syslog_lock); - } - - /* Return a consistent copy of @syslog_seq. */ -@@ -1596,9 +1596,9 @@ static u64 read_syslog_seq_irq(void) - { - u64 seq; - -- raw_spin_lock_irq(&syslog_lock); -+ spin_lock_irq(&syslog_lock); - seq = syslog_seq; -- raw_spin_unlock_irq(&syslog_lock); -+ spin_unlock_irq(&syslog_lock); - - return seq; - } -@@ -1676,10 +1676,10 @@ int do_syslog(int type, char __user *buf, int len, int source) - break; - /* Number of chars in the log buffer */ - case SYSLOG_ACTION_SIZE_UNREAD: -- raw_spin_lock_irq(&syslog_lock); -+ spin_lock_irq(&syslog_lock); - if (!prb_read_valid_info(prb, syslog_seq, &info, NULL)) { - /* No unread messages. 
*/ -- raw_spin_unlock_irq(&syslog_lock); -+ spin_unlock_irq(&syslog_lock); - return 0; - } - if (info.seq != syslog_seq) { -@@ -1707,7 +1707,7 @@ int do_syslog(int type, char __user *buf, int len, int source) - } - error -= syslog_partial; - } -- raw_spin_unlock_irq(&syslog_lock); -+ spin_unlock_irq(&syslog_lock); - break; - /* Size of the log buffer */ - case SYSLOG_ACTION_SIZE_BUFFER: -@@ -2995,9 +2995,9 @@ void register_console(struct console *newcon) - exclusive_console_stop_seq = console_seq; - - /* Get a consistent copy of @syslog_seq. */ -- raw_spin_lock_irqsave(&syslog_lock, flags); -+ spin_lock_irqsave(&syslog_lock, flags); - console_seq = syslog_seq; -- raw_spin_unlock_irqrestore(&syslog_lock, flags); -+ spin_unlock_irqrestore(&syslog_lock, flags); - } - console_unlock(); - console_sysfs_notify(); --- -2.30.2 - diff --git a/debian/patches-rt/0100-console-add-write_atomic-interface.patch b/debian/patches-rt/0100-console-add-write_atomic-interface.patch deleted file mode 100644 index 21db77702..000000000 --- a/debian/patches-rt/0100-console-add-write_atomic-interface.patch +++ /dev/null @@ -1,163 +0,0 @@ -From 1147ced59018b3cf5628d8ec7364715e4f4a2b11 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:01 +0106 -Subject: [PATCH 100/296] console: add write_atomic interface -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Add a write_atomic() callback to the console. This is an optional -function for console drivers. The function must be atomic (including -NMI safe) for writing to the console. - -Console drivers must still implement the write() callback. The -write_atomic() callback will only be used in special situations, -such as when the kernel panics. - -Creating an NMI safe write_atomic() that must synchronize with -write() requires a careful implementation of the console driver. 
To -aid with the implementation, a set of console_atomic_*() functions -are provided: - - void console_atomic_lock(unsigned int *flags); - void console_atomic_unlock(unsigned int flags); - -These functions synchronize using a processor-reentrant spinlock -(called a cpulock). - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/console.h | 4 ++ - kernel/printk/printk.c | 100 ++++++++++++++++++++++++++++++++++++++++ - 2 files changed, 104 insertions(+) - -diff --git a/include/linux/console.h b/include/linux/console.h -index 4b1e26c4cb42..46c27780ea39 100644 ---- a/include/linux/console.h -+++ b/include/linux/console.h -@@ -141,6 +141,7 @@ static inline int con_debug_leave(void) - struct console { - char name[16]; - void (*write)(struct console *, const char *, unsigned); -+ void (*write_atomic)(struct console *co, const char *s, unsigned int count); - int (*read)(struct console *, char *, unsigned); - struct tty_driver *(*device)(struct console *, int *); - void (*unblank)(void); -@@ -230,4 +231,7 @@ extern void console_init(void); - void dummycon_register_output_notifier(struct notifier_block *nb); - void dummycon_unregister_output_notifier(struct notifier_block *nb); - -+extern void console_atomic_lock(unsigned int *flags); -+extern void console_atomic_unlock(unsigned int flags); -+ - #endif /* _LINUX_CONSOLE_H */ -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 067aa589b4e8..491558ff40d5 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -3542,3 +3542,103 @@ void kmsg_dump_rewind(struct kmsg_dumper_iter *iter) - EXPORT_SYMBOL_GPL(kmsg_dump_rewind); - - #endif -+ -+struct prb_cpulock { -+ atomic_t owner; -+ unsigned long __percpu *irqflags; -+}; -+ -+#define DECLARE_STATIC_PRINTKRB_CPULOCK(name) \ -+static DEFINE_PER_CPU(unsigned long, _##name##_percpu_irqflags); \ -+static struct prb_cpulock name = { \ -+ .owner = ATOMIC_INIT(-1), \ 
-+ .irqflags = &_##name##_percpu_irqflags, \ -+} -+ -+static bool __prb_trylock(struct prb_cpulock *cpu_lock, -+ unsigned int *cpu_store) -+{ -+ unsigned long *flags; -+ unsigned int cpu; -+ -+ cpu = get_cpu(); -+ -+ *cpu_store = atomic_read(&cpu_lock->owner); -+ /* memory barrier to ensure the current lock owner is visible */ -+ smp_rmb(); -+ if (*cpu_store == -1) { -+ flags = per_cpu_ptr(cpu_lock->irqflags, cpu); -+ local_irq_save(*flags); -+ if (atomic_try_cmpxchg_acquire(&cpu_lock->owner, -+ cpu_store, cpu)) { -+ return true; -+ } -+ local_irq_restore(*flags); -+ } else if (*cpu_store == cpu) { -+ return true; -+ } -+ -+ put_cpu(); -+ return false; -+} -+ -+/* -+ * prb_lock: Perform a processor-reentrant spin lock. -+ * @cpu_lock: A pointer to the lock object. -+ * @cpu_store: A "flags" pointer to store lock status information. -+ * -+ * If no processor has the lock, the calling processor takes the lock and -+ * becomes the owner. If the calling processor is already the owner of the -+ * lock, this function succeeds immediately. If lock is locked by another -+ * processor, this function spins until the calling processor becomes the -+ * owner. -+ * -+ * It is safe to call this function from any context and state. -+ */ -+static void prb_lock(struct prb_cpulock *cpu_lock, unsigned int *cpu_store) -+{ -+ for (;;) { -+ if (__prb_trylock(cpu_lock, cpu_store)) -+ break; -+ cpu_relax(); -+ } -+} -+ -+/* -+ * prb_unlock: Perform a processor-reentrant spin unlock. -+ * @cpu_lock: A pointer to the lock object. -+ * @cpu_store: A "flags" object storing lock status information. -+ * -+ * Release the lock. The calling processor must be the owner of the lock. -+ * -+ * It is safe to call this function from any context and state. 
-+ */ -+static void prb_unlock(struct prb_cpulock *cpu_lock, unsigned int cpu_store) -+{ -+ unsigned long *flags; -+ unsigned int cpu; -+ -+ cpu = atomic_read(&cpu_lock->owner); -+ atomic_set_release(&cpu_lock->owner, cpu_store); -+ -+ if (cpu_store == -1) { -+ flags = per_cpu_ptr(cpu_lock->irqflags, cpu); -+ local_irq_restore(*flags); -+ } -+ -+ put_cpu(); -+} -+ -+DECLARE_STATIC_PRINTKRB_CPULOCK(printk_cpulock); -+ -+void console_atomic_lock(unsigned int *flags) -+{ -+ prb_lock(&printk_cpulock, flags); -+} -+EXPORT_SYMBOL(console_atomic_lock); -+ -+void console_atomic_unlock(unsigned int flags) -+{ -+ prb_unlock(&printk_cpulock, flags); -+} -+EXPORT_SYMBOL(console_atomic_unlock); --- -2.30.2 - diff --git a/debian/patches-rt/0102-printk-relocate-printk_delay-and-vprintk_default.patch b/debian/patches-rt/0102-printk-relocate-printk_delay-and-vprintk_default.patch deleted file mode 100644 index d88bac685..000000000 --- a/debian/patches-rt/0102-printk-relocate-printk_delay-and-vprintk_default.patch +++ /dev/null @@ -1,89 +0,0 @@ -From ee8e45472cb2e8848df4d470474a8f023a694499 Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:03 +0106 -Subject: [PATCH 102/296] printk: relocate printk_delay() and vprintk_default() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Move printk_delay() and vprintk_default() "as is" further up so that -they can be used by new functions in an upcoming commit. 
- -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 40 ++++++++++++++++++++-------------------- - 1 file changed, 20 insertions(+), 20 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 491558ff40d5..157417654b65 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -1726,6 +1726,20 @@ SYSCALL_DEFINE3(syslog, int, type, char __user *, buf, int, len) - return do_syslog(type, buf, len, SYSLOG_FROM_READER); - } - -+int printk_delay_msec __read_mostly; -+ -+static inline void printk_delay(void) -+{ -+ if (unlikely(printk_delay_msec)) { -+ int m = printk_delay_msec; -+ -+ while (m--) { -+ mdelay(1); -+ touch_nmi_watchdog(); -+ } -+ } -+} -+ - /* - * Special console_lock variants that help to reduce the risk of soft-lockups. - * They allow to pass console_lock to another printk() call using a busy wait. -@@ -1969,20 +1983,6 @@ static void printk_exit_irqrestore(unsigned long flags) - local_irq_restore(flags); - } - --int printk_delay_msec __read_mostly; -- --static inline void printk_delay(void) --{ -- if (unlikely(printk_delay_msec)) { -- int m = printk_delay_msec; -- -- while (m--) { -- mdelay(1); -- touch_nmi_watchdog(); -- } -- } --} -- - static inline u32 printk_caller_id(void) - { - return in_task() ? 
task_pid_nr(current) : -@@ -2215,18 +2215,18 @@ asmlinkage int vprintk_emit(int facility, int level, - } - EXPORT_SYMBOL(vprintk_emit); - --asmlinkage int vprintk(const char *fmt, va_list args) --{ -- return vprintk_func(fmt, args); --} --EXPORT_SYMBOL(vprintk); -- - int vprintk_default(const char *fmt, va_list args) - { - return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, fmt, args); - } - EXPORT_SYMBOL_GPL(vprintk_default); - -+asmlinkage int vprintk(const char *fmt, va_list args) -+{ -+ return vprintk_func(fmt, args); -+} -+EXPORT_SYMBOL(vprintk); -+ - /** - * printk - print a kernel message - * @fmt: format string --- -2.30.2 - diff --git a/debian/patches-rt/0104-printk-change-console_seq-to-atomic64_t.patch b/debian/patches-rt/0104-printk-change-console_seq-to-atomic64_t.patch deleted file mode 100644 index edefbf8d1..000000000 --- a/debian/patches-rt/0104-printk-change-console_seq-to-atomic64_t.patch +++ /dev/null @@ -1,132 +0,0 @@ -From b106287310be84c0934b83d10b05831ec3e6dc9b Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:05 +0106 -Subject: [PATCH 104/296] printk: change @console_seq to atomic64_t -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -In preparation for atomic printing, change @console_seq to atomic -so that it can be accessed without requiring @console_sem. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/printk/printk.c | 34 +++++++++++++++++++--------------- - 1 file changed, 19 insertions(+), 15 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 0333d11966ac..a3763f25cede 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -366,12 +366,13 @@ static u64 syslog_seq; - static size_t syslog_partial; - static bool syslog_time; - --/* All 3 protected by @console_sem. 
*/ --/* the next printk record to write to the console */ --static u64 console_seq; -+/* Both protected by @console_sem. */ - static u64 exclusive_console_stop_seq; - static unsigned long console_dropped; - -+/* the next printk record to write to the console */ -+static atomic64_t console_seq = ATOMIC64_INIT(0); -+ - struct latched_seq { - seqcount_latch_t latch; - u64 val[2]; -@@ -2271,7 +2272,7 @@ EXPORT_SYMBOL(printk); - #define prb_first_valid_seq(rb) 0 - - static u64 syslog_seq; --static u64 console_seq; -+static atomic64_t console_seq = ATOMIC64_INIT(0); - static u64 exclusive_console_stop_seq; - static unsigned long console_dropped; - -@@ -2579,6 +2580,7 @@ void console_unlock(void) - bool do_cond_resched, retry; - struct printk_info info; - struct printk_record r; -+ u64 seq; - - if (console_suspended) { - up_console_sem(); -@@ -2621,12 +2623,14 @@ void console_unlock(void) - size_t len; - - skip: -- if (!prb_read_valid(prb, console_seq, &r)) -+ seq = atomic64_read(&console_seq); -+ if (!prb_read_valid(prb, seq, &r)) - break; - -- if (console_seq != r.info->seq) { -- console_dropped += r.info->seq - console_seq; -- console_seq = r.info->seq; -+ if (seq != r.info->seq) { -+ console_dropped += r.info->seq - seq; -+ atomic64_set(&console_seq, r.info->seq); -+ seq = r.info->seq; - } - - if (suppress_message_printing(r.info->level)) { -@@ -2635,13 +2639,13 @@ void console_unlock(void) - * directly to the console when we received it, and - * record that has level above the console loglevel. - */ -- console_seq++; -+ atomic64_set(&console_seq, seq + 1); - goto skip; - } - - /* Output to all consoles once old messages replayed. 
*/ - if (unlikely(exclusive_console && -- console_seq >= exclusive_console_stop_seq)) { -+ seq >= exclusive_console_stop_seq)) { - exclusive_console = NULL; - } - -@@ -2662,7 +2666,7 @@ void console_unlock(void) - len = record_print_text(&r, - console_msg_format & MSG_FORMAT_SYSLOG, - printk_time); -- console_seq++; -+ atomic64_set(&console_seq, seq + 1); - - /* - * While actively printing out messages, if another printk() -@@ -2693,7 +2697,7 @@ void console_unlock(void) - * there's a new owner and the console_unlock() from them will do the - * flush, no worries. - */ -- retry = prb_read_valid(prb, console_seq, NULL); -+ retry = prb_read_valid(prb, atomic64_read(&console_seq), NULL); - if (retry && console_trylock()) - goto again; - } -@@ -2756,7 +2760,7 @@ void console_flush_on_panic(enum con_flush_mode mode) - console_may_schedule = 0; - - if (mode == CONSOLE_REPLAY_ALL) -- console_seq = prb_first_valid_seq(prb); -+ atomic64_set(&console_seq, prb_first_valid_seq(prb)); - console_unlock(); - } - -@@ -2993,11 +2997,11 @@ void register_console(struct console *newcon) - * ignores console_lock. - */ - exclusive_console = newcon; -- exclusive_console_stop_seq = console_seq; -+ exclusive_console_stop_seq = atomic64_read(&console_seq); - - /* Get a consistent copy of @syslog_seq. 
*/ - spin_lock_irqsave(&syslog_lock, flags); -- console_seq = syslog_seq; -+ atomic64_set(&console_seq, syslog_seq); - spin_unlock_irqrestore(&syslog_lock, flags); - } - console_unlock(); --- -2.30.2 - diff --git a/debian/patches-rt/0107-printk-remove-deferred-printing.patch b/debian/patches-rt/0107-printk-remove-deferred-printing.patch deleted file mode 100644 index 508127f0a..000000000 --- a/debian/patches-rt/0107-printk-remove-deferred-printing.patch +++ /dev/null @@ -1,432 +0,0 @@ -From e136d38937fed01a70d48851bcdd1c7f1d1d242e Mon Sep 17 00:00:00 2001 -From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:08 +0106 -Subject: [PATCH 107/296] printk: remove deferred printing -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Since printing occurs either atomically or from the printing -kthread, there is no need for any deferring or tracking possible -recursion paths. Remove all printk context tracking. - -Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arm/kernel/smp.c | 2 - - arch/powerpc/kexec/crash.c | 3 -- - include/linux/hardirq.h | 2 - - include/linux/printk.h | 12 ----- - kernel/printk/Makefile | 1 - - kernel/printk/internal.h | 70 ----------------------------- - kernel/printk/printk.c | 58 ++++++++++-------------- - kernel/printk/printk_safe.c | 88 ------------------------------------- - kernel/trace/trace.c | 2 - - 9 files changed, 22 insertions(+), 216 deletions(-) - delete mode 100644 kernel/printk/internal.h - delete mode 100644 kernel/printk/printk_safe.c - -diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c -index 48099c6e1e4a..609b7d3104ea 100644 ---- a/arch/arm/kernel/smp.c -+++ b/arch/arm/kernel/smp.c -@@ -672,9 +672,7 @@ static void do_handle_IPI(int ipinr) - break; - - case IPI_CPU_BACKTRACE: -- printk_nmi_enter(); - nmi_cpu_backtrace(get_irq_regs()); -- printk_nmi_exit(); - break; 
- - default: -diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c -index c9a889880214..d488311efab1 100644 ---- a/arch/powerpc/kexec/crash.c -+++ b/arch/powerpc/kexec/crash.c -@@ -311,9 +311,6 @@ void default_machine_crash_shutdown(struct pt_regs *regs) - unsigned int i; - int (*old_handler)(struct pt_regs *regs); - -- /* Avoid hardlocking with irresponsive CPU holding logbuf_lock */ -- printk_nmi_enter(); -- - /* - * This function is only called after the system - * has panicked or is otherwise in a critical state. -diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h -index 754f67ac4326..c35b71f8644a 100644 ---- a/include/linux/hardirq.h -+++ b/include/linux/hardirq.h -@@ -115,7 +115,6 @@ extern void rcu_nmi_exit(void); - do { \ - lockdep_off(); \ - arch_nmi_enter(); \ -- printk_nmi_enter(); \ - BUG_ON(in_nmi() == NMI_MASK); \ - __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \ - } while (0) -@@ -134,7 +133,6 @@ extern void rcu_nmi_exit(void); - do { \ - BUG_ON(!in_nmi()); \ - __preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \ -- printk_nmi_exit(); \ - arch_nmi_exit(); \ - lockdep_on(); \ - } while (0) -diff --git a/include/linux/printk.h b/include/linux/printk.h -index 1ebd93581acc..153212445b68 100644 ---- a/include/linux/printk.h -+++ b/include/linux/printk.h -@@ -155,18 +155,6 @@ static inline __printf(1, 2) __cold - void early_printk(const char *s, ...) 
{ } - #endif - --#ifdef CONFIG_PRINTK_NMI --extern void printk_nmi_enter(void); --extern void printk_nmi_exit(void); --extern void printk_nmi_direct_enter(void); --extern void printk_nmi_direct_exit(void); --#else --static inline void printk_nmi_enter(void) { } --static inline void printk_nmi_exit(void) { } --static inline void printk_nmi_direct_enter(void) { } --static inline void printk_nmi_direct_exit(void) { } --#endif /* PRINTK_NMI */ -- - struct dev_printk_info; - - #ifdef CONFIG_PRINTK -diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile -index eee3dc9b60a9..59cb24e25f00 100644 ---- a/kernel/printk/Makefile -+++ b/kernel/printk/Makefile -@@ -1,5 +1,4 @@ - # SPDX-License-Identifier: GPL-2.0-only - obj-y = printk.o --obj-$(CONFIG_PRINTK) += printk_safe.o - obj-$(CONFIG_A11Y_BRAILLE_CONSOLE) += braille.o - obj-$(CONFIG_PRINTK) += printk_ringbuffer.o -diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h -deleted file mode 100644 -index e108b2ece8c7..000000000000 ---- a/kernel/printk/internal.h -+++ /dev/null -@@ -1,70 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0-or-later */ --/* -- * internal.h - printk internal definitions -- */ --#include <linux/percpu.h> -- --#ifdef CONFIG_PRINTK -- --#define PRINTK_SAFE_CONTEXT_MASK 0x007ffffff --#define PRINTK_NMI_DIRECT_CONTEXT_MASK 0x008000000 --#define PRINTK_NMI_CONTEXT_MASK 0xff0000000 -- --#define PRINTK_NMI_CONTEXT_OFFSET 0x010000000 -- --__printf(4, 0) --int vprintk_store(int facility, int level, -- const struct dev_printk_info *dev_info, -- const char *fmt, va_list args); -- --__printf(1, 0) int vprintk_default(const char *fmt, va_list args); --__printf(1, 0) int vprintk_deferred(const char *fmt, va_list args); --__printf(1, 0) int vprintk_func(const char *fmt, va_list args); --void __printk_safe_enter(void); --void __printk_safe_exit(void); -- --bool printk_percpu_data_ready(void); -- --#define printk_safe_enter_irqsave(flags) \ -- do { \ -- local_irq_save(flags); \ -- __printk_safe_enter(); 
\ -- } while (0) -- --#define printk_safe_exit_irqrestore(flags) \ -- do { \ -- __printk_safe_exit(); \ -- local_irq_restore(flags); \ -- } while (0) -- --#define printk_safe_enter_irq() \ -- do { \ -- local_irq_disable(); \ -- __printk_safe_enter(); \ -- } while (0) -- --#define printk_safe_exit_irq() \ -- do { \ -- __printk_safe_exit(); \ -- local_irq_enable(); \ -- } while (0) -- --void defer_console_output(void); -- --#else -- --__printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; } -- --/* -- * In !PRINTK builds we still export console_sem -- * semaphore and some of console functions (console_unlock()/etc.), so -- * printk-safe must preserve the existing local IRQ guarantees. -- */ --#define printk_safe_enter_irqsave(flags) local_irq_save(flags) --#define printk_safe_exit_irqrestore(flags) local_irq_restore(flags) -- --#define printk_safe_enter_irq() local_irq_disable() --#define printk_safe_exit_irq() local_irq_enable() -- --static inline bool printk_percpu_data_ready(void) { return false; } --#endif /* CONFIG_PRINTK */ -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 23c6e3992962..f5fd3e1671fe 100644 ---- a/kernel/printk/printk.c -+++ b/kernel/printk/printk.c -@@ -45,6 +45,7 @@ - #include <linux/ctype.h> - #include <linux/uio.h> - #include <linux/kthread.h> -+#include <linux/kdb.h> - #include <linux/clocksource.h> - #include <linux/sched/clock.h> - #include <linux/sched/debug.h> -@@ -60,7 +61,6 @@ - #include "printk_ringbuffer.h" - #include "console_cmdline.h" - #include "braille.h" --#include "internal.h" - - int console_printk[4] = { - CONSOLE_LOGLEVEL_DEFAULT, /* console_loglevel */ -@@ -227,19 +227,7 @@ static int nr_ext_console_drivers; - - static int __down_trylock_console_sem(unsigned long ip) - { -- int lock_failed; -- unsigned long flags; -- -- /* -- * Here and in __up_console_sem() we need to be in safe mode, -- * because spindump/WARN/etc from under console ->lock will -- * deadlock in 
printk()->down_trylock_console_sem() otherwise. -- */ -- printk_safe_enter_irqsave(flags); -- lock_failed = down_trylock(&console_sem); -- printk_safe_exit_irqrestore(flags); -- -- if (lock_failed) -+ if (down_trylock(&console_sem)) - return 1; - mutex_acquire(&console_lock_dep_map, 0, 1, ip); - return 0; -@@ -248,13 +236,9 @@ static int __down_trylock_console_sem(unsigned long ip) - - static void __up_console_sem(unsigned long ip) - { -- unsigned long flags; -- - mutex_release(&console_lock_dep_map, ip); - -- printk_safe_enter_irqsave(flags); - up(&console_sem); -- printk_safe_exit_irqrestore(flags); - } - #define up_console_sem() __up_console_sem(_RET_IP_) - -@@ -426,7 +410,7 @@ static struct printk_ringbuffer *prb = &printk_rb_static; - */ - static bool __printk_percpu_data_ready __read_mostly; - --bool printk_percpu_data_ready(void) -+static bool printk_percpu_data_ready(void) - { - return __printk_percpu_data_ready; - } -@@ -1061,7 +1045,6 @@ void __init setup_log_buf(int early) - struct printk_record r; - size_t new_descs_size; - size_t new_infos_size; -- unsigned long flags; - char *new_log_buf; - unsigned int free; - u64 seq; -@@ -1959,9 +1942,9 @@ static u16 printk_sprint(char *text, u16 size, int facility, enum log_flags *lfl - } - - __printf(4, 0) --int vprintk_store(int facility, int level, -- const struct dev_printk_info *dev_info, -- const char *fmt, va_list args) -+static int vprintk_store(int facility, int level, -+ const struct dev_printk_info *dev_info, -+ const char *fmt, va_list args) - { - const u32 caller_id = printk_caller_id(); - struct prb_reserved_entry e; -@@ -2107,11 +2090,22 @@ asmlinkage int vprintk_emit(int facility, int level, - } - EXPORT_SYMBOL(vprintk_emit); - --int vprintk_default(const char *fmt, va_list args) -+__printf(1, 0) -+static int vprintk_default(const char *fmt, va_list args) - { - return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, fmt, args); - } --EXPORT_SYMBOL_GPL(vprintk_default); -+ -+__printf(1, 0) -+static int 
vprintk_func(const char *fmt, va_list args) -+{ -+#ifdef CONFIG_KGDB_KDB -+ /* Allow to pass printk() to kdb but avoid a recursion. */ -+ if (unlikely(kdb_trap_printk && kdb_printf_cpu < 0)) -+ return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args); -+#endif -+ return vprintk_default(fmt, args); -+} - - asmlinkage int vprintk(const char *fmt, va_list args) - { -@@ -3069,18 +3063,10 @@ void wake_up_klogd(void) - preempt_enable(); - } - --void defer_console_output(void) --{ --} -- --int vprintk_deferred(const char *fmt, va_list args) -+__printf(1, 0) -+static int vprintk_deferred(const char *fmt, va_list args) - { -- int r; -- -- r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); -- defer_console_output(); -- -- return r; -+ return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, fmt, args); - } - - int printk_deferred(const char *fmt, ...) -diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c -deleted file mode 100644 -index c23b127a6545..000000000000 ---- a/kernel/printk/printk_safe.c -+++ /dev/null -@@ -1,88 +0,0 @@ --// SPDX-License-Identifier: GPL-2.0-or-later --/* -- * printk_safe.c - Safe printk for printk-deadlock-prone contexts -- */ -- --#include <linux/preempt.h> --#include <linux/spinlock.h> --#include <linux/debug_locks.h> --#include <linux/kdb.h> --#include <linux/smp.h> --#include <linux/cpumask.h> --#include <linux/irq_work.h> --#include <linux/printk.h> --#include <linux/kprobes.h> -- --#include "internal.h" -- --static DEFINE_PER_CPU(int, printk_context); -- --#ifdef CONFIG_PRINTK_NMI --void noinstr printk_nmi_enter(void) --{ -- this_cpu_add(printk_context, PRINTK_NMI_CONTEXT_OFFSET); --} -- --void noinstr printk_nmi_exit(void) --{ -- this_cpu_sub(printk_context, PRINTK_NMI_CONTEXT_OFFSET); --} -- --/* -- * Marks a code that might produce many messages in NMI context -- * and the risk of losing them is more critical than eventual -- * reordering. 
-- */ --void printk_nmi_direct_enter(void) --{ -- if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK) -- this_cpu_or(printk_context, PRINTK_NMI_DIRECT_CONTEXT_MASK); --} -- --void printk_nmi_direct_exit(void) --{ -- this_cpu_and(printk_context, ~PRINTK_NMI_DIRECT_CONTEXT_MASK); --} -- --#endif /* CONFIG_PRINTK_NMI */ -- --/* Can be preempted by NMI. */ --void __printk_safe_enter(void) --{ -- this_cpu_inc(printk_context); --} -- --/* Can be preempted by NMI. */ --void __printk_safe_exit(void) --{ -- this_cpu_dec(printk_context); --} -- --__printf(1, 0) int vprintk_func(const char *fmt, va_list args) --{ --#ifdef CONFIG_KGDB_KDB -- /* Allow to pass printk() to kdb but avoid a recursion. */ -- if (unlikely(kdb_trap_printk && kdb_printf_cpu < 0)) -- return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args); --#endif -- -- /* -- * Use the main logbuf even in NMI. But avoid calling console -- * drivers that might have their own locks. -- */ -- if (this_cpu_read(printk_context) & -- (PRINTK_NMI_DIRECT_CONTEXT_MASK | -- PRINTK_NMI_CONTEXT_MASK | -- PRINTK_SAFE_CONTEXT_MASK)) { -- int len; -- -- printk_safe_enter_irqsave(flags); -- len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args); -- printk_safe_exit_irqrestore(flags); -- defer_console_output(); -- return len; -- } -- -- /* No obstacles. 
*/ -- return vprintk_default(fmt, args); --} -diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c -index 57fb77628df1..ea5ed64017ec 100644 ---- a/kernel/trace/trace.c -+++ b/kernel/trace/trace.c -@@ -9296,7 +9296,6 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) - tracing_off(); - - local_irq_save(flags); -- printk_nmi_direct_enter(); - - /* Simulate the iterator */ - trace_init_global_iter(&iter); -@@ -9376,7 +9375,6 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) - atomic_dec(&per_cpu_ptr(iter.array_buffer->data, cpu)->disabled); - } - atomic_dec(&dump_running); -- printk_nmi_direct_exit(); - local_irq_restore(flags); - } - EXPORT_SYMBOL_GPL(ftrace_dump); --- -2.30.2 - diff --git a/debian/patches-rt/0112-tpm-remove-tpm_dev_wq_lock.patch b/debian/patches-rt/0112-tpm-remove-tpm_dev_wq_lock.patch deleted file mode 100644 index 6c70d19c4..000000000 --- a/debian/patches-rt/0112-tpm-remove-tpm_dev_wq_lock.patch +++ /dev/null @@ -1,35 +0,0 @@ -From 24d84605b39e5e0a2f08012879714e8d62a45bec Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 11 Feb 2019 11:33:11 +0100 -Subject: [PATCH 112/296] tpm: remove tpm_dev_wq_lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Added in commit - - 9e1b74a63f776 ("tpm: add support for nonblocking operation") - -but never actually used it. 
- -Cc: Philip Tricca <philip.b.tricca@intel.com> -Cc: Tadeusz Struk <tadeusz.struk@intel.com> -Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/char/tpm/tpm-dev-common.c | 1 - - 1 file changed, 1 deletion(-) - -diff --git a/drivers/char/tpm/tpm-dev-common.c b/drivers/char/tpm/tpm-dev-common.c -index 1784530b8387..c08cbb306636 100644 ---- a/drivers/char/tpm/tpm-dev-common.c -+++ b/drivers/char/tpm/tpm-dev-common.c -@@ -20,7 +20,6 @@ - #include "tpm-dev.h" - - static struct workqueue_struct *tpm_dev_wq; --static DEFINE_MUTEX(tpm_dev_wq_lock); - - static ssize_t tpm_dev_transmit(struct tpm_chip *chip, struct tpm_space *space, - u8 *buf, size_t bufsiz) --- -2.30.2 - diff --git a/debian/patches-rt/0113-shmem-Use-raw_spinlock_t-for-stat_lock.patch b/debian/patches-rt/0113-shmem-Use-raw_spinlock_t-for-stat_lock.patch deleted file mode 100644 index 4a7f6ad7c..000000000 --- a/debian/patches-rt/0113-shmem-Use-raw_spinlock_t-for-stat_lock.patch +++ /dev/null @@ -1,147 +0,0 @@ -From 596e27c7622eca1181eaef97c5de007d85e6a99e Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 14 Aug 2020 18:53:34 +0200 -Subject: [PATCH 113/296] shmem: Use raw_spinlock_t for ->stat_lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Each CPU has SHMEM_INO_BATCH inodes available in `->ino_batch' which is -per-CPU. Access here is serialized by disabling preemption. If the pool is -empty, it gets reloaded from `->next_ino'. Access here is serialized by -->stat_lock which is a spinlock_t and can not be acquired with disabled -preemption. -One way around it would make per-CPU ino_batch struct containing the inode -number a local_lock_t. -Another sollution is to promote ->stat_lock to a raw_spinlock_t. The critical -sections are short. 
The mpol_put() should be moved outside of the critical -section to avoid invoking the destrutor with disabled preemption. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/shmem_fs.h | 2 +- - mm/shmem.c | 31 +++++++++++++++++-------------- - 2 files changed, 18 insertions(+), 15 deletions(-) - -diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h -index a5a5d1d4d7b1..0470d1582b09 100644 ---- a/include/linux/shmem_fs.h -+++ b/include/linux/shmem_fs.h -@@ -31,7 +31,7 @@ struct shmem_sb_info { - struct percpu_counter used_blocks; /* How many are allocated */ - unsigned long max_inodes; /* How many inodes are allowed */ - unsigned long free_inodes; /* How many are left for allocation */ -- spinlock_t stat_lock; /* Serialize shmem_sb_info changes */ -+ raw_spinlock_t stat_lock; /* Serialize shmem_sb_info changes */ - umode_t mode; /* Mount mode for root directory */ - unsigned char huge; /* Whether to try for hugepages */ - kuid_t uid; /* Mount uid for root directory */ -diff --git a/mm/shmem.c b/mm/shmem.c -index 537c137698f8..1c473d6123bc 100644 ---- a/mm/shmem.c -+++ b/mm/shmem.c -@@ -278,10 +278,10 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) - ino_t ino; - - if (!(sb->s_flags & SB_KERNMOUNT)) { -- spin_lock(&sbinfo->stat_lock); -+ raw_spin_lock(&sbinfo->stat_lock); - if (sbinfo->max_inodes) { - if (!sbinfo->free_inodes) { -- spin_unlock(&sbinfo->stat_lock); -+ raw_spin_unlock(&sbinfo->stat_lock); - return -ENOSPC; - } - sbinfo->free_inodes--; -@@ -304,7 +304,7 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) - } - *inop = ino; - } -- spin_unlock(&sbinfo->stat_lock); -+ raw_spin_unlock(&sbinfo->stat_lock); - } else if (inop) { - /* - * __shmem_file_setup, one of our callers, is lock-free: it -@@ -319,13 +319,14 @@ static int shmem_reserve_inode(struct super_block *sb, ino_t *inop) - * to worry about things like glibc compatibility. 
- */ - ino_t *next_ino; -+ - next_ino = per_cpu_ptr(sbinfo->ino_batch, get_cpu()); - ino = *next_ino; - if (unlikely(ino % SHMEM_INO_BATCH == 0)) { -- spin_lock(&sbinfo->stat_lock); -+ raw_spin_lock(&sbinfo->stat_lock); - ino = sbinfo->next_ino; - sbinfo->next_ino += SHMEM_INO_BATCH; -- spin_unlock(&sbinfo->stat_lock); -+ raw_spin_unlock(&sbinfo->stat_lock); - if (unlikely(is_zero_ino(ino))) - ino++; - } -@@ -341,9 +342,9 @@ static void shmem_free_inode(struct super_block *sb) - { - struct shmem_sb_info *sbinfo = SHMEM_SB(sb); - if (sbinfo->max_inodes) { -- spin_lock(&sbinfo->stat_lock); -+ raw_spin_lock(&sbinfo->stat_lock); - sbinfo->free_inodes++; -- spin_unlock(&sbinfo->stat_lock); -+ raw_spin_unlock(&sbinfo->stat_lock); - } - } - -@@ -1479,10 +1480,10 @@ static struct mempolicy *shmem_get_sbmpol(struct shmem_sb_info *sbinfo) - { - struct mempolicy *mpol = NULL; - if (sbinfo->mpol) { -- spin_lock(&sbinfo->stat_lock); /* prevent replace/use races */ -+ raw_spin_lock(&sbinfo->stat_lock); /* prevent replace/use races */ - mpol = sbinfo->mpol; - mpol_get(mpol); -- spin_unlock(&sbinfo->stat_lock); -+ raw_spin_unlock(&sbinfo->stat_lock); - } - return mpol; - } -@@ -3592,9 +3593,10 @@ static int shmem_reconfigure(struct fs_context *fc) - struct shmem_options *ctx = fc->fs_private; - struct shmem_sb_info *sbinfo = SHMEM_SB(fc->root->d_sb); - unsigned long inodes; -+ struct mempolicy *mpol = NULL; - const char *err; - -- spin_lock(&sbinfo->stat_lock); -+ raw_spin_lock(&sbinfo->stat_lock); - inodes = sbinfo->max_inodes - sbinfo->free_inodes; - if ((ctx->seen & SHMEM_SEEN_BLOCKS) && ctx->blocks) { - if (!sbinfo->max_blocks) { -@@ -3639,14 +3641,15 @@ static int shmem_reconfigure(struct fs_context *fc) - * Preserve previous mempolicy unless mpol remount option was specified. 
- */ - if (ctx->mpol) { -- mpol_put(sbinfo->mpol); -+ mpol = sbinfo->mpol; - sbinfo->mpol = ctx->mpol; /* transfers initial ref */ - ctx->mpol = NULL; - } -- spin_unlock(&sbinfo->stat_lock); -+ raw_spin_unlock(&sbinfo->stat_lock); -+ mpol_put(mpol); - return 0; - out: -- spin_unlock(&sbinfo->stat_lock); -+ raw_spin_unlock(&sbinfo->stat_lock); - return invalfc(fc, "%s", err); - } - -@@ -3763,7 +3766,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc) - sbinfo->mpol = ctx->mpol; - ctx->mpol = NULL; - -- spin_lock_init(&sbinfo->stat_lock); -+ raw_spin_lock_init(&sbinfo->stat_lock); - if (percpu_counter_init(&sbinfo->used_blocks, 0, GFP_KERNEL)) - goto failed; - spin_lock_init(&sbinfo->shrinklist_lock); --- -2.30.2 - diff --git a/debian/patches-rt/0114-net-Move-lockdep-where-it-belongs.patch b/debian/patches-rt/0114-net-Move-lockdep-where-it-belongs.patch deleted file mode 100644 index 17cfe8431..000000000 --- a/debian/patches-rt/0114-net-Move-lockdep-where-it-belongs.patch +++ /dev/null @@ -1,46 +0,0 @@ -From ec101b47596c9aa3cfaf4060093c2d8374900e71 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 8 Sep 2020 07:32:20 +0200 -Subject: [PATCH 114/296] net: Move lockdep where it belongs -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - net/core/sock.c | 6 ++---- - 1 file changed, 2 insertions(+), 4 deletions(-) - -diff --git a/net/core/sock.c b/net/core/sock.c -index c75c1e723a84..5dcdc71839e6 100644 ---- a/net/core/sock.c -+++ b/net/core/sock.c -@@ -3031,12 +3031,11 @@ void lock_sock_nested(struct sock *sk, int subclass) - if (sk->sk_lock.owned) - __lock_sock(sk); - sk->sk_lock.owned = 1; -- spin_unlock(&sk->sk_lock.slock); -+ spin_unlock_bh(&sk->sk_lock.slock); - /* - * The sk_lock has mutex_lock() semantics here: - */ - mutex_acquire(&sk->sk_lock.dep_map, subclass, 0, _RET_IP_); -- local_bh_enable(); 
- } - EXPORT_SYMBOL(lock_sock_nested); - -@@ -3085,12 +3084,11 @@ bool lock_sock_fast(struct sock *sk) - - __lock_sock(sk); - sk->sk_lock.owned = 1; -- spin_unlock(&sk->sk_lock.slock); -+ spin_unlock_bh(&sk->sk_lock.slock); - /* - * The sk_lock has mutex_lock() semantics here: - */ - mutex_acquire(&sk->sk_lock.dep_map, 0, 0, _RET_IP_); -- local_bh_enable(); - return true; - } - EXPORT_SYMBOL(lock_sock_fast); --- -2.30.2 - diff --git a/debian/patches-rt/0116-parisc-Remove-bogus-__IRQ_STAT-macro.patch b/debian/patches-rt/0116-parisc-Remove-bogus-__IRQ_STAT-macro.patch deleted file mode 100644 index 3ec6d50cf..000000000 --- a/debian/patches-rt/0116-parisc-Remove-bogus-__IRQ_STAT-macro.patch +++ /dev/null @@ -1,31 +0,0 @@ -From 3438dd654ff623f2e37777a734abfe8f251cc564 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:08 +0100 -Subject: [PATCH 116/296] parisc: Remove bogus __IRQ_STAT macro -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -This is a leftover from a historical array based implementation and unused. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141732.680780121@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/parisc/include/asm/hardirq.h | 1 - - 1 file changed, 1 deletion(-) - -diff --git a/arch/parisc/include/asm/hardirq.h b/arch/parisc/include/asm/hardirq.h -index 7f7039516e53..fad29aa6f45f 100644 ---- a/arch/parisc/include/asm/hardirq.h -+++ b/arch/parisc/include/asm/hardirq.h -@@ -32,7 +32,6 @@ typedef struct { - DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); - - #define __ARCH_IRQ_STAT --#define __IRQ_STAT(cpu, member) (irq_stat[cpu].member) - #define inc_irq_stat(member) this_cpu_inc(irq_stat.member) - #define __inc_irq_stat(member) __this_cpu_inc(irq_stat.member) - #define ack_bad_irq(irq) WARN(1, "unexpected IRQ trap at vector %02x\n", irq) --- -2.30.2 - diff --git a/debian/patches-rt/0117-sh-Get-rid-of-nmi_count.patch b/debian/patches-rt/0117-sh-Get-rid-of-nmi_count.patch deleted file mode 100644 index 43628adbd..000000000 --- a/debian/patches-rt/0117-sh-Get-rid-of-nmi_count.patch +++ /dev/null @@ -1,47 +0,0 @@ -From cdf54e6506245563bd90c9ecd00c9c1c8c7a1799 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:09 +0100 -Subject: [PATCH 117/296] sh: Get rid of nmi_count() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -nmi_count() is a historical leftover and SH is the only user. Replace it -with regular per cpu accessors. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141732.844232404@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/sh/kernel/irq.c | 2 +- - arch/sh/kernel/traps.c | 2 +- - 2 files changed, 2 insertions(+), 2 deletions(-) - -diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c -index 5717c7cbdd97..5addcb2c2da0 100644 ---- a/arch/sh/kernel/irq.c -+++ b/arch/sh/kernel/irq.c -@@ -44,7 +44,7 @@ int arch_show_interrupts(struct seq_file *p, int prec) - - seq_printf(p, "%*s: ", prec, "NMI"); - for_each_online_cpu(j) -- seq_printf(p, "%10u ", nmi_count(j)); -+ seq_printf(p, "%10u ", per_cpu(irq_stat.__nmi_count, j)); - seq_printf(p, " Non-maskable interrupts\n"); - - seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count)); -diff --git a/arch/sh/kernel/traps.c b/arch/sh/kernel/traps.c -index 9c3d32b80038..f5beecdac693 100644 ---- a/arch/sh/kernel/traps.c -+++ b/arch/sh/kernel/traps.c -@@ -186,7 +186,7 @@ BUILD_TRAP_HANDLER(nmi) - arch_ftrace_nmi_enter(); - - nmi_enter(); -- nmi_count(cpu)++; -+ this_cpu_inc(irq_stat.__nmi_count); - - switch (notify_die(DIE_NMI, "NMI", regs, 0, vec & 0xff, SIGINT)) { - case NOTIFY_OK: --- -2.30.2 - diff --git a/debian/patches-rt/0118-irqstat-Get-rid-of-nmi_count-and-__IRQ_STAT.patch b/debian/patches-rt/0118-irqstat-Get-rid-of-nmi_count-and-__IRQ_STAT.patch deleted file mode 100644 index 70782d6f5..000000000 --- a/debian/patches-rt/0118-irqstat-Get-rid-of-nmi_count-and-__IRQ_STAT.patch +++ /dev/null @@ -1,34 +0,0 @@ -From 832fce0cc00ddd40c724f3721dd7a9e954c44432 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:10 +0100 -Subject: [PATCH 118/296] irqstat: Get rid of nmi_count() and __IRQ_STAT() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Nothing uses this anymore.
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.005212732@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/irq_cpustat.h | 4 ---- - 1 file changed, 4 deletions(-) - -diff --git a/include/linux/irq_cpustat.h b/include/linux/irq_cpustat.h -index 6e8895cd4d92..78fb2de3ea4d 100644 ---- a/include/linux/irq_cpustat.h -+++ b/include/linux/irq_cpustat.h -@@ -19,10 +19,6 @@ - - #ifndef __ARCH_IRQ_STAT - DECLARE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat); /* defined in asm/hardirq.h */ --#define __IRQ_STAT(cpu, member) (per_cpu(irq_stat.member, cpu)) - #endif - --/* arch dependent irq_stat fields */ --#define nmi_count(cpu) __IRQ_STAT((cpu), __nmi_count) /* i386 */ -- - #endif /* __irq_cpustat_h */ --- -2.30.2 - diff --git a/debian/patches-rt/0119-um-irqstat-Get-rid-of-the-duplicated-declarations.patch b/debian/patches-rt/0119-um-irqstat-Get-rid-of-the-duplicated-declarations.patch deleted file mode 100644 index 477db0bec..000000000 --- a/debian/patches-rt/0119-um-irqstat-Get-rid-of-the-duplicated-declarations.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 22888c4359fd97b6101129b2b142c73df57e5a68 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:11 +0100 -Subject: [PATCH 119/296] um/irqstat: Get rid of the duplicated declarations -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -irq_cpustat_t and ack_bad_irq() are exactly the same as the asm-generic -ones. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.156361337@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/um/include/asm/hardirq.h | 17 +---------------- - 1 file changed, 1 insertion(+), 16 deletions(-) - -diff --git a/arch/um/include/asm/hardirq.h b/arch/um/include/asm/hardirq.h -index b426796d26fd..52e2c36267a9 100644 ---- a/arch/um/include/asm/hardirq.h -+++ b/arch/um/include/asm/hardirq.h -@@ -2,22 +2,7 @@ - #ifndef __ASM_UM_HARDIRQ_H - #define __ASM_UM_HARDIRQ_H - --#include <linux/cache.h> --#include <linux/threads.h> -- --typedef struct { -- unsigned int __softirq_pending; --} ____cacheline_aligned irq_cpustat_t; -- --#include <linux/irq_cpustat.h> /* Standard mappings for irq_cpustat_t above */ --#include <linux/irq.h> -- --#ifndef ack_bad_irq --static inline void ack_bad_irq(unsigned int irq) --{ -- printk(KERN_CRIT "unexpected IRQ trap at vector %02x\n", irq); --} --#endif -+#include <asm-generic/hardirq.h> - - #define __ARCH_IRQ_EXIT_IRQS_DISABLED 1 - --- -2.30.2 - diff --git a/debian/patches-rt/0120-ARM-irqstat-Get-rid-of-duplicated-declaration.patch b/debian/patches-rt/0120-ARM-irqstat-Get-rid-of-duplicated-declaration.patch deleted file mode 100644 index ddc85553b..000000000 --- a/debian/patches-rt/0120-ARM-irqstat-Get-rid-of-duplicated-declaration.patch +++ /dev/null @@ -1,59 +0,0 @@ -From 780248929f56cb3514329e24b3295bb8094949f8 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:12 +0100 -Subject: [PATCH 120/296] ARM: irqstat: Get rid of duplicated declaration -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -irq_cpustat_t is exactly the same as the asm-generic one. Define -ack_bad_irq so the generic header does not emit the generic version of it. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> -Link: https://lore.kernel.org/r/20201113141733.276505871@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arm/include/asm/hardirq.h | 11 +++-------- - arch/arm/include/asm/irq.h | 2 ++ - 2 files changed, 5 insertions(+), 8 deletions(-) - -diff --git a/arch/arm/include/asm/hardirq.h b/arch/arm/include/asm/hardirq.h -index b95848ed2bc7..706efafbf972 100644 ---- a/arch/arm/include/asm/hardirq.h -+++ b/arch/arm/include/asm/hardirq.h -@@ -2,16 +2,11 @@ - #ifndef __ASM_HARDIRQ_H - #define __ASM_HARDIRQ_H - --#include <linux/cache.h> --#include <linux/threads.h> - #include <asm/irq.h> - --typedef struct { -- unsigned int __softirq_pending; --} ____cacheline_aligned irq_cpustat_t; -- --#include <linux/irq_cpustat.h> /* Standard mappings for irq_cpustat_t above */ -- - #define __ARCH_IRQ_EXIT_IRQS_DISABLED 1 -+#define ack_bad_irq ack_bad_irq -+ -+#include <asm-generic/hardirq.h> - - #endif /* __ASM_HARDIRQ_H */ -diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h -index 46d41140df27..1cbcc462b07e 100644 ---- a/arch/arm/include/asm/irq.h -+++ b/arch/arm/include/asm/irq.h -@@ -31,6 +31,8 @@ void handle_IRQ(unsigned int, struct pt_regs *); - void init_IRQ(void); - - #ifdef CONFIG_SMP -+#include <linux/cpumask.h> -+ - extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask, - bool exclude_self); - #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace --- -2.30.2 - diff --git a/debian/patches-rt/0121-arm64-irqstat-Get-rid-of-duplicated-declaration.patch b/debian/patches-rt/0121-arm64-irqstat-Get-rid-of-duplicated-declaration.patch deleted file mode 100644 index 7679cfb57..000000000 --- a/debian/patches-rt/0121-arm64-irqstat-Get-rid-of-duplicated-declaration.patch +++ /dev/null @@ -1,40 +0,0 @@ -From 
0646937c519a4133a107c99a4a54a41aaf885e9f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:13 +0100 -Subject: [PATCH 121/296] arm64: irqstat: Get rid of duplicated declaration -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -irq_cpustat_t is exactly the same as the asm-generic one. Define -ack_bad_irq so the generic header does not emit the generic version of it. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Acked-by: Will Deacon <will@kernel.org> -Acked-by: Marc Zyngier <maz@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.392015387@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arm64/include/asm/hardirq.h | 7 ++----- - 1 file changed, 2 insertions(+), 5 deletions(-) - -diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h -index 5ffa4bacdad3..cbfa7b6f2e09 100644 ---- a/arch/arm64/include/asm/hardirq.h -+++ b/arch/arm64/include/asm/hardirq.h -@@ -13,11 +13,8 @@ - #include <asm/kvm_arm.h> - #include <asm/sysreg.h> - --typedef struct { -- unsigned int __softirq_pending; --} ____cacheline_aligned irq_cpustat_t; -- --#include <linux/irq_cpustat.h> /* Standard mappings for irq_cpustat_t above */ -+#define ack_bad_irq ack_bad_irq -+#include <asm-generic/hardirq.h> - - #define __ARCH_IRQ_EXIT_IRQS_DISABLED 1 - --- -2.30.2 - diff --git a/debian/patches-rt/0122-asm-generic-irqstat-Add-optional-__nmi_count-member.patch b/debian/patches-rt/0122-asm-generic-irqstat-Add-optional-__nmi_count-member.patch deleted file mode 100644 index 6b84aa70b..000000000 --- a/debian/patches-rt/0122-asm-generic-irqstat-Add-optional-__nmi_count-member.patch +++ /dev/null @@ -1,34 +0,0 @@ -From 43e7dc9203020a94a208be644e979265eb2c0398 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:14 +0100 
-Subject: [PATCH 122/296] asm-generic/irqstat: Add optional __nmi_count member -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Add an optional __nmi_count member to irq_cpustat_t so more architectures -can use the generic version. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.501611990@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/asm-generic/hardirq.h | 3 +++ - 1 file changed, 3 insertions(+) - -diff --git a/include/asm-generic/hardirq.h b/include/asm-generic/hardirq.h -index d14214dfc10b..f5dd99781e3c 100644 ---- a/include/asm-generic/hardirq.h -+++ b/include/asm-generic/hardirq.h -@@ -7,6 +7,9 @@ - - typedef struct { - unsigned int __softirq_pending; -+#ifdef ARCH_WANTS_NMI_IRQSTAT -+ unsigned int __nmi_count; -+#endif - } ____cacheline_aligned irq_cpustat_t; - - #include <linux/irq_cpustat.h> /* Standard mappings for irq_cpustat_t above */ --- -2.30.2 - diff --git a/debian/patches-rt/0123-sh-irqstat-Use-the-generic-irq_cpustat_t.patch b/debian/patches-rt/0123-sh-irqstat-Use-the-generic-irq_cpustat_t.patch deleted file mode 100644 index 495c4f7e2..000000000 --- a/debian/patches-rt/0123-sh-irqstat-Use-the-generic-irq_cpustat_t.patch +++ /dev/null @@ -1,45 +0,0 @@ -From a29eee1b3a56f674706ec295b5dbfc4079e2ad90 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:15 +0100 -Subject: [PATCH 123/296] sh: irqstat: Use the generic irq_cpustat_t -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -SH can now use the generic irq_cpustat_t. Define ack_bad_irq so the generic -header does not emit the generic version of it. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.625146223@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/sh/include/asm/hardirq.h | 14 ++++---------- - 1 file changed, 4 insertions(+), 10 deletions(-) - -diff --git a/arch/sh/include/asm/hardirq.h b/arch/sh/include/asm/hardirq.h -index edaea3559a23..9fe4495a8e90 100644 ---- a/arch/sh/include/asm/hardirq.h -+++ b/arch/sh/include/asm/hardirq.h -@@ -2,16 +2,10 @@ - #ifndef __ASM_SH_HARDIRQ_H - #define __ASM_SH_HARDIRQ_H - --#include <linux/threads.h> --#include <linux/irq.h> -- --typedef struct { -- unsigned int __softirq_pending; -- unsigned int __nmi_count; /* arch dependent */ --} ____cacheline_aligned irq_cpustat_t; -- --#include <linux/irq_cpustat.h> /* Standard mappings for irq_cpustat_t above */ -- - extern void ack_bad_irq(unsigned int irq); -+#define ack_bad_irq ack_bad_irq -+#define ARCH_WANTS_NMI_IRQSTAT -+ -+#include <asm-generic/hardirq.h> - - #endif /* __ASM_SH_HARDIRQ_H */ --- -2.30.2 - diff --git a/debian/patches-rt/0124-irqstat-Move-declaration-into-asm-generic-hardirq.h.patch b/debian/patches-rt/0124-irqstat-Move-declaration-into-asm-generic-hardirq.h.patch deleted file mode 100644 index 4f98c35a7..000000000 --- a/debian/patches-rt/0124-irqstat-Move-declaration-into-asm-generic-hardirq.h.patch +++ /dev/null @@ -1,66 +0,0 @@ -From 02cc63713c8cd4e983df04729953eb649e1878f5 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:16 +0100 -Subject: [PATCH 124/296] irqstat: Move declaration into asm-generic/hardirq.h -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Move the declaration of the irq_cpustat per cpu variable to -asm-generic/hardirq.h and remove the now empty linux/irq_cpustat.h header. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.737377332@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/asm-generic/hardirq.h | 3 ++- - include/linux/irq_cpustat.h | 24 ------------------------ - 2 files changed, 2 insertions(+), 25 deletions(-) - delete mode 100644 include/linux/irq_cpustat.h - -diff --git a/include/asm-generic/hardirq.h b/include/asm-generic/hardirq.h -index f5dd99781e3c..7317e8258b48 100644 ---- a/include/asm-generic/hardirq.h -+++ b/include/asm-generic/hardirq.h -@@ -12,7 +12,8 @@ typedef struct { - #endif - } ____cacheline_aligned irq_cpustat_t; - --#include <linux/irq_cpustat.h> /* Standard mappings for irq_cpustat_t above */ -+DECLARE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat); -+ - #include <linux/irq.h> - - #ifndef ack_bad_irq -diff --git a/include/linux/irq_cpustat.h b/include/linux/irq_cpustat.h -deleted file mode 100644 -index 78fb2de3ea4d..000000000000 ---- a/include/linux/irq_cpustat.h -+++ /dev/null -@@ -1,24 +0,0 @@ --/* SPDX-License-Identifier: GPL-2.0 */ --#ifndef __irq_cpustat_h --#define __irq_cpustat_h -- --/* -- * Contains default mappings for irq_cpustat_t, used by almost every -- * architecture. Some arch (like s390) have per cpu hardware pages and -- * they define their own mappings for irq_stat. -- * -- * Keith Owens <kaos@ocs.com.au> July 2000. -- */ -- -- --/* -- * Simple wrappers reducing source bloat. Define all irq_stat fields -- * here, even ones that are arch dependent. That way we get common -- * definitions instead of differing sets for each arch. 
-- */ -- --#ifndef __ARCH_IRQ_STAT --DECLARE_PER_CPU_ALIGNED(irq_cpustat_t, irq_stat); /* defined in asm/hardirq.h */ --#endif -- --#endif /* __irq_cpustat_h */ --- -2.30.2 - diff --git a/debian/patches-rt/0125-preempt-Cleanup-the-macro-maze-a-bit.patch b/debian/patches-rt/0125-preempt-Cleanup-the-macro-maze-a-bit.patch deleted file mode 100644 index fa9c5eda6..000000000 --- a/debian/patches-rt/0125-preempt-Cleanup-the-macro-maze-a-bit.patch +++ /dev/null @@ -1,78 +0,0 @@ -From 416b6bcc44aa455377894c326d5d0e6fa967494a Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:17 +0100 -Subject: [PATCH 125/296] preempt: Cleanup the macro maze a bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Make the macro maze consistent and prepare it for adding the RT variant for -BH accounting. - - - Use nmi_count() for the NMI portion of preempt count - - Introduce in_hardirq() to make the naming consistent and non-ambiguos - - Use the macros to create combined checks (e.g. in_task()) so the - softirq representation for RT just falls into place. 
- - Update comments and move the deprecated macros aside - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.864469886@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/preempt.h | 30 ++++++++++++++++-------------- - 1 file changed, 16 insertions(+), 14 deletions(-) - -diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index 6df63cbe8bb0..69cc8b64aa3a 100644 ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -77,31 +77,33 @@ - /* preempt_count() and related functions, depends on PREEMPT_NEED_RESCHED */ - #include <asm/preempt.h> - -+#define nmi_count() (preempt_count() & NMI_MASK) - #define hardirq_count() (preempt_count() & HARDIRQ_MASK) - #define softirq_count() (preempt_count() & SOFTIRQ_MASK) --#define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK \ -- | NMI_MASK)) -+#define irq_count() (nmi_count() | hardirq_count() | softirq_count()) - - /* -- * Are we doing bottom half or hardware interrupt processing? 
-+ * Macros to retrieve the current execution context: - * -- * in_irq() - We're in (hard) IRQ context -+ * in_nmi() - We're in NMI context -+ * in_hardirq() - We're in hard IRQ context -+ * in_serving_softirq() - We're in softirq context -+ * in_task() - We're in task context -+ */ -+#define in_nmi() (nmi_count()) -+#define in_hardirq() (hardirq_count()) -+#define in_serving_softirq() (softirq_count() & SOFTIRQ_OFFSET) -+#define in_task() (!(in_nmi() | in_hardirq() | in_serving_softirq())) -+ -+/* -+ * The following macros are deprecated and should not be used in new code: -+ * in_irq() - Obsolete version of in_hardirq() - * in_softirq() - We have BH disabled, or are processing softirqs - * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled -- * in_serving_softirq() - We're in softirq context -- * in_nmi() - We're in NMI context -- * in_task() - We're in task context -- * -- * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really -- * should not be used in new code. 
- */ - #define in_irq() (hardirq_count()) - #define in_softirq() (softirq_count()) - #define in_interrupt() (irq_count()) --#define in_serving_softirq() (softirq_count() & SOFTIRQ_OFFSET) --#define in_nmi() (preempt_count() & NMI_MASK) --#define in_task() (!(preempt_count() & \ -- (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) - - /* - * The preempt_count offset after preempt_disable(); --- -2.30.2 - diff --git a/debian/patches-rt/0126-softirq-Move-related-code-into-one-section.patch b/debian/patches-rt/0126-softirq-Move-related-code-into-one-section.patch deleted file mode 100644 index 5e76fbf04..000000000 --- a/debian/patches-rt/0126-softirq-Move-related-code-into-one-section.patch +++ /dev/null @@ -1,169 +0,0 @@ -From a3483c99d6edac7ef24369fae406b2655090b2b0 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 13 Nov 2020 15:02:18 +0100 -Subject: [PATCH 126/296] softirq: Move related code into one section -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -To prepare for adding a RT aware variant of softirq serialization and -processing move related code into one section so the necessary #ifdeffery -is reduced to one. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Link: https://lore.kernel.org/r/20201113141733.974214480@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/softirq.c | 107 ++++++++++++++++++++++++----------------------- - 1 file changed, 54 insertions(+), 53 deletions(-) - -diff --git a/kernel/softirq.c b/kernel/softirq.c -index 09229ad82209..617009ccd82c 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -92,6 +92,13 @@ static bool ksoftirqd_running(unsigned long pending) - !__kthread_should_park(tsk); - } - -+#ifdef CONFIG_TRACE_IRQFLAGS -+DEFINE_PER_CPU(int, hardirqs_enabled); -+DEFINE_PER_CPU(int, hardirq_context); -+EXPORT_PER_CPU_SYMBOL_GPL(hardirqs_enabled); -+EXPORT_PER_CPU_SYMBOL_GPL(hardirq_context); -+#endif -+ - /* - * preempt_count and SOFTIRQ_OFFSET usage: - * - preempt_count is changed by SOFTIRQ_OFFSET on entering or leaving -@@ -102,17 +109,11 @@ static bool ksoftirqd_running(unsigned long pending) - * softirq and whether we just have bh disabled. 
- */ - -+#ifdef CONFIG_TRACE_IRQFLAGS - /* -- * This one is for softirq.c-internal use, -- * where hardirqs are disabled legitimately: -+ * This is for softirq.c-internal use, where hardirqs are disabled -+ * legitimately: - */ --#ifdef CONFIG_TRACE_IRQFLAGS -- --DEFINE_PER_CPU(int, hardirqs_enabled); --DEFINE_PER_CPU(int, hardirq_context); --EXPORT_PER_CPU_SYMBOL_GPL(hardirqs_enabled); --EXPORT_PER_CPU_SYMBOL_GPL(hardirq_context); -- - void __local_bh_disable_ip(unsigned long ip, unsigned int cnt) - { - unsigned long flags; -@@ -203,6 +204,50 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt) - } - EXPORT_SYMBOL(__local_bh_enable_ip); - -+static inline void invoke_softirq(void) -+{ -+ if (ksoftirqd_running(local_softirq_pending())) -+ return; -+ -+ if (!force_irqthreads) { -+#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK -+ /* -+ * We can safely execute softirq on the current stack if -+ * it is the irq stack, because it should be near empty -+ * at this stage. -+ */ -+ __do_softirq(); -+#else -+ /* -+ * Otherwise, irq_exit() is called on the task stack that can -+ * be potentially deep already. So call softirq in its own stack -+ * to prevent from any overrun. -+ */ -+ do_softirq_own_stack(); -+#endif -+ } else { -+ wakeup_softirqd(); -+ } -+} -+ -+asmlinkage __visible void do_softirq(void) -+{ -+ __u32 pending; -+ unsigned long flags; -+ -+ if (in_interrupt()) -+ return; -+ -+ local_irq_save(flags); -+ -+ pending = local_softirq_pending(); -+ -+ if (pending && !ksoftirqd_running(pending)) -+ do_softirq_own_stack(); -+ -+ local_irq_restore(flags); -+} -+ - /* - * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times, - * but break the loop if need_resched() is set or after 2 ms. 
-@@ -327,24 +372,6 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) - current_restore_flags(old_flags, PF_MEMALLOC); - } - --asmlinkage __visible void do_softirq(void) --{ -- __u32 pending; -- unsigned long flags; -- -- if (in_interrupt()) -- return; -- -- local_irq_save(flags); -- -- pending = local_softirq_pending(); -- -- if (pending && !ksoftirqd_running(pending)) -- do_softirq_own_stack(); -- -- local_irq_restore(flags); --} -- - /** - * irq_enter_rcu - Enter an interrupt context with RCU watching - */ -@@ -371,32 +398,6 @@ void irq_enter(void) - irq_enter_rcu(); - } - --static inline void invoke_softirq(void) --{ -- if (ksoftirqd_running(local_softirq_pending())) -- return; -- -- if (!force_irqthreads) { --#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK -- /* -- * We can safely execute softirq on the current stack if -- * it is the irq stack, because it should be near empty -- * at this stage. -- */ -- __do_softirq(); --#else -- /* -- * Otherwise, irq_exit() is called on the task stack that can -- * be potentially deep already. So call softirq in its own stack -- * to prevent from any overrun. 
-- */ -- do_softirq_own_stack(); --#endif -- } else { -- wakeup_softirqd(); -- } --} -- - static inline void tick_irq_exit(void) - { - #ifdef CONFIG_NO_HZ_COMMON --- -2.30.2 - diff --git a/debian/patches-rt/0127-sh-irq-Add-missing-closing-parentheses-in-arch_show_.patch b/debian/patches-rt/0127-sh-irq-Add-missing-closing-parentheses-in-arch_show_.patch deleted file mode 100644 index 0e09e2ecd..000000000 --- a/debian/patches-rt/0127-sh-irq-Add-missing-closing-parentheses-in-arch_show_.patch +++ /dev/null @@ -1,40 +0,0 @@ -From 65ca25560962f560dca2ec51f29a3d3585a6ebe1 Mon Sep 17 00:00:00 2001 -From: Geert Uytterhoeven <geert+renesas@glider.be> -Date: Tue, 24 Nov 2020 14:06:56 +0100 -Subject: [PATCH 127/296] sh/irq: Add missing closing parentheses in - arch_show_interrupts() -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - - arch/sh/kernel/irq.c: In function ‘arch_show_interrupts’: - arch/sh/kernel/irq.c:47:58: error: expected ‘)’ before ‘;’ token - 47 | seq_printf(p, "%10u ", per_cpu(irq_stat.__nmi_count, j); - | ^ - -Fixes: fe3f1d5d7cd3062c ("sh: Get rid of nmi_count()") -Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://lore.kernel.org/r/20201124130656.2741743-1-geert+renesas@glider.be -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/sh/kernel/irq.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c -index 5addcb2c2da0..ab5f790b0cd2 100644 ---- a/arch/sh/kernel/irq.c -+++ b/arch/sh/kernel/irq.c -@@ -44,7 +44,7 @@ int arch_show_interrupts(struct seq_file *p, int prec) - - seq_printf(p, "%*s: ", prec, "NMI"); - for_each_online_cpu(j) -- seq_printf(p, "%10u ", per_cpu(irq_stat.__nmi_count, j); -+ seq_printf(p, "%10u ", per_cpu(irq_stat.__nmi_count, j)); - 
seq_printf(p, " Non-maskable interrupts\n"); - - seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count)); --- -2.30.2 - diff --git a/debian/patches-rt/0128-sched-cputime-Remove-symbol-exports-from-IRQ-time-ac.patch b/debian/patches-rt/0128-sched-cputime-Remove-symbol-exports-from-IRQ-time-ac.patch deleted file mode 100644 index ec214274f..000000000 --- a/debian/patches-rt/0128-sched-cputime-Remove-symbol-exports-from-IRQ-time-ac.patch +++ /dev/null @@ -1,73 +0,0 @@ -From a805322d0dcd5d64dc33a4de70545c5025b5262f Mon Sep 17 00:00:00 2001 -From: Frederic Weisbecker <frederic@kernel.org> -Date: Wed, 2 Dec 2020 12:57:28 +0100 -Subject: [PATCH 128/296] sched/cputime: Remove symbol exports from IRQ time - accounting -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -account_irq_enter_time() and account_irq_exit_time() are not called -from modules. EXPORT_SYMBOL_GPL() can be safely removed from the IRQ -cputime accounting functions called from there. - -Signed-off-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://lore.kernel.org/r/20201202115732.27827-2-frederic@kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/s390/kernel/vtime.c | 10 +++++----- - kernel/sched/cputime.c | 2 -- - 2 files changed, 5 insertions(+), 7 deletions(-) - -diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c -index 579ec3a8c816..710135905deb 100644 ---- a/arch/s390/kernel/vtime.c -+++ b/arch/s390/kernel/vtime.c -@@ -227,7 +227,7 @@ void vtime_flush(struct task_struct *tsk) - * Update process times based on virtual cpu times stored by entry.S - * to the lowcore fields user_timer, system_timer & steal_clock. 
- */ --void vtime_account_irq_enter(struct task_struct *tsk) -+void vtime_account_kernel(struct task_struct *tsk) - { - u64 timer; - -@@ -246,12 +246,12 @@ void vtime_account_irq_enter(struct task_struct *tsk) - - virt_timer_forward(timer); - } --EXPORT_SYMBOL_GPL(vtime_account_irq_enter); -- --void vtime_account_kernel(struct task_struct *tsk) --__attribute__((alias("vtime_account_irq_enter"))); - EXPORT_SYMBOL_GPL(vtime_account_kernel); - -+void vtime_account_irq_enter(struct task_struct *tsk) -+__attribute__((alias("vtime_account_kernel"))); -+ -+ - /* - * Sorted add to a list. List is linear searched until first bigger - * element is found. -diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c -index 5a55d2300452..61ce9f9bf0a3 100644 ---- a/kernel/sched/cputime.c -+++ b/kernel/sched/cputime.c -@@ -71,7 +71,6 @@ void irqtime_account_irq(struct task_struct *curr) - else if (in_serving_softirq() && curr != this_cpu_ksoftirqd()) - irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ); - } --EXPORT_SYMBOL_GPL(irqtime_account_irq); - - static u64 irqtime_tick_accounted(u64 maxtime) - { -@@ -434,7 +433,6 @@ void vtime_account_irq_enter(struct task_struct *tsk) - else - vtime_account_kernel(tsk); - } --EXPORT_SYMBOL_GPL(vtime_account_irq_enter); - #endif /* __ARCH_HAS_VTIME_ACCOUNT */ - - void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev, --- -2.30.2 - diff --git a/debian/patches-rt/0129-s390-vtime-Use-the-generic-IRQ-entry-accounting.patch b/debian/patches-rt/0129-s390-vtime-Use-the-generic-IRQ-entry-accounting.patch deleted file mode 100644 index 4904b9d3f..000000000 --- a/debian/patches-rt/0129-s390-vtime-Use-the-generic-IRQ-entry-accounting.patch +++ /dev/null @@ -1,126 +0,0 @@ -From e08f88a48f3a616454b1c8d25857ad6a481b7e61 Mon Sep 17 00:00:00 2001 -From: Frederic Weisbecker <frederic@kernel.org> -Date: Wed, 2 Dec 2020 12:57:29 +0100 -Subject: [PATCH 129/296] s390/vtime: Use the generic IRQ entry accounting -Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -s390 has its own version of IRQ entry accounting because it doesn't -account the idle time the same way the other architectures do. Only -the actual idle sleep time is accounted as idle time, the rest of the -idle task execution is accounted as system time. - -Make the generic IRQ entry accounting aware of architectures that have -their own way of accounting idle time and convert s390 to use it. - -This prepares s390 to get involved in further consolidations of IRQ -time accounting. - -Signed-off-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://lore.kernel.org/r/20201202115732.27827-3-frederic@kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/Kconfig | 7 ++++++- - arch/s390/Kconfig | 1 + - arch/s390/include/asm/vtime.h | 1 - - arch/s390/kernel/vtime.c | 4 ---- - kernel/sched/cputime.c | 13 ++----------- - 5 files changed, 9 insertions(+), 17 deletions(-) - -diff --git a/arch/Kconfig b/arch/Kconfig -index 69fe7133c765..3a036011ac8d 100644 ---- a/arch/Kconfig -+++ b/arch/Kconfig -@@ -643,6 +643,12 @@ config HAVE_TIF_NOHZ - config HAVE_VIRT_CPU_ACCOUNTING - bool - -+config HAVE_VIRT_CPU_ACCOUNTING_IDLE -+ bool -+ help -+ Architecture has its own way to account idle CPU time and therefore -+ doesn't implement vtime_account_idle(). -+ - config ARCH_HAS_SCALED_CPUTIME - bool - -@@ -657,7 +663,6 @@ config HAVE_VIRT_CPU_ACCOUNTING_GEN - some 32-bit arches may require multiple accesses, so proper - locking is needed to protect against concurrent accesses. 
- -- - config HAVE_IRQ_TIME_ACCOUNTING - bool - help -diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig -index 4a2a12be04c9..6f1fdcd3b5db 100644 ---- a/arch/s390/Kconfig -+++ b/arch/s390/Kconfig -@@ -181,6 +181,7 @@ config S390 - select HAVE_RSEQ - select HAVE_SYSCALL_TRACEPOINTS - select HAVE_VIRT_CPU_ACCOUNTING -+ select HAVE_VIRT_CPU_ACCOUNTING_IDLE - select IOMMU_HELPER if PCI - select IOMMU_SUPPORT if PCI - select MODULES_USE_ELF_RELA -diff --git a/arch/s390/include/asm/vtime.h b/arch/s390/include/asm/vtime.h -index 3622d4ebc73a..fac6a67988eb 100644 ---- a/arch/s390/include/asm/vtime.h -+++ b/arch/s390/include/asm/vtime.h -@@ -2,7 +2,6 @@ - #ifndef _S390_VTIME_H - #define _S390_VTIME_H - --#define __ARCH_HAS_VTIME_ACCOUNT - #define __ARCH_HAS_VTIME_TASK_SWITCH - - #endif /* _S390_VTIME_H */ -diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c -index 710135905deb..18a97631af43 100644 ---- a/arch/s390/kernel/vtime.c -+++ b/arch/s390/kernel/vtime.c -@@ -248,10 +248,6 @@ void vtime_account_kernel(struct task_struct *tsk) - } - EXPORT_SYMBOL_GPL(vtime_account_kernel); - --void vtime_account_irq_enter(struct task_struct *tsk) --__attribute__((alias("vtime_account_kernel"))); -- -- - /* - * Sorted add to a list. List is linear searched until first bigger - * element is found. -diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c -index 61ce9f9bf0a3..2783162542b1 100644 ---- a/kernel/sched/cputime.c -+++ b/kernel/sched/cputime.c -@@ -417,23 +417,14 @@ void vtime_task_switch(struct task_struct *prev) - } - # endif - --/* -- * Archs that account the whole time spent in the idle task -- * (outside irq) as idle time can rely on this and just implement -- * vtime_account_kernel() and vtime_account_idle(). Archs that -- * have other meaning of the idle time (s390 only includes the -- * time spent by the CPU when it's in low power mode) must override -- * vtime_account(). 
-- */ --#ifndef __ARCH_HAS_VTIME_ACCOUNT - void vtime_account_irq_enter(struct task_struct *tsk) - { -- if (!in_interrupt() && is_idle_task(tsk)) -+ if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) && -+ !in_interrupt() && is_idle_task(tsk)) - vtime_account_idle(tsk); - else - vtime_account_kernel(tsk); - } --#endif /* __ARCH_HAS_VTIME_ACCOUNT */ - - void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev, - u64 *ut, u64 *st) --- -2.30.2 - diff --git a/debian/patches-rt/0130-sched-vtime-Consolidate-IRQ-time-accounting.patch b/debian/patches-rt/0130-sched-vtime-Consolidate-IRQ-time-accounting.patch deleted file mode 100644 index c15cc7bab..000000000 --- a/debian/patches-rt/0130-sched-vtime-Consolidate-IRQ-time-accounting.patch +++ /dev/null @@ -1,303 +0,0 @@ -From 4775fe014a0280ec7eee444e5144b01769076d35 Mon Sep 17 00:00:00 2001 -From: Frederic Weisbecker <frederic@kernel.org> -Date: Wed, 2 Dec 2020 12:57:30 +0100 -Subject: [PATCH 130/296] sched/vtime: Consolidate IRQ time accounting -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The 3 architectures implementing CONFIG_VIRT_CPU_ACCOUNTING_NATIVE -all have their own version of irq time accounting that dispatch the -cputime to the appropriate index: hardirq, softirq, system, idle, -guest... from an all-in-one function. - -Instead of having these ad-hoc versions, move the cputime destination -dispatch decision to the core code and leave only the actual per-index -cputime accounting to the architecture. 
- -Signed-off-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://lore.kernel.org/r/20201202115732.27827-4-frederic@kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/ia64/kernel/time.c | 20 ++++++++++---- - arch/powerpc/kernel/time.c | 56 +++++++++++++++++++++++++++----------- - arch/s390/kernel/vtime.c | 45 +++++++++++++++++++++--------- - include/linux/vtime.h | 16 ++++------- - kernel/sched/cputime.c | 13 ++++++--- - 5 files changed, 102 insertions(+), 48 deletions(-) - -diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c -index 7abc5f37bfaf..733e0e3324b8 100644 ---- a/arch/ia64/kernel/time.c -+++ b/arch/ia64/kernel/time.c -@@ -138,12 +138,8 @@ void vtime_account_kernel(struct task_struct *tsk) - struct thread_info *ti = task_thread_info(tsk); - __u64 stime = vtime_delta(tsk); - -- if ((tsk->flags & PF_VCPU) && !irq_count()) -+ if (tsk->flags & PF_VCPU) - ti->gtime += stime; -- else if (hardirq_count()) -- ti->hardirq_time += stime; -- else if (in_serving_softirq()) -- ti->softirq_time += stime; - else - ti->stime += stime; - } -@@ -156,6 +152,20 @@ void vtime_account_idle(struct task_struct *tsk) - ti->idle_time += vtime_delta(tsk); - } - -+void vtime_account_softirq(struct task_struct *tsk) -+{ -+ struct thread_info *ti = task_thread_info(tsk); -+ -+ ti->softirq_time += vtime_delta(tsk); -+} -+ -+void vtime_account_hardirq(struct task_struct *tsk) -+{ -+ struct thread_info *ti = task_thread_info(tsk); -+ -+ ti->hardirq_time += vtime_delta(tsk); -+} -+ - #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */ - - static irqreturn_t -diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c -index 1d20f0f77a92..7e0a497a36ee 100644 ---- a/arch/powerpc/kernel/time.c -+++ b/arch/powerpc/kernel/time.c -@@ -312,12 +312,11 @@ static unsigned long vtime_delta_scaled(struct cpu_accounting_data *acct, - return stime_scaled; - } - --static unsigned long 
vtime_delta(struct task_struct *tsk, -+static unsigned long vtime_delta(struct cpu_accounting_data *acct, - unsigned long *stime_scaled, - unsigned long *steal_time) - { - unsigned long now, stime; -- struct cpu_accounting_data *acct = get_accounting(tsk); - - WARN_ON_ONCE(!irqs_disabled()); - -@@ -332,29 +331,30 @@ static unsigned long vtime_delta(struct task_struct *tsk, - return stime; - } - -+static void vtime_delta_kernel(struct cpu_accounting_data *acct, -+ unsigned long *stime, unsigned long *stime_scaled) -+{ -+ unsigned long steal_time; -+ -+ *stime = vtime_delta(acct, stime_scaled, &steal_time); -+ *stime -= min(*stime, steal_time); -+ acct->steal_time += steal_time; -+} -+ - void vtime_account_kernel(struct task_struct *tsk) - { -- unsigned long stime, stime_scaled, steal_time; - struct cpu_accounting_data *acct = get_accounting(tsk); -+ unsigned long stime, stime_scaled; - -- stime = vtime_delta(tsk, &stime_scaled, &steal_time); -- -- stime -= min(stime, steal_time); -- acct->steal_time += steal_time; -+ vtime_delta_kernel(acct, &stime, &stime_scaled); - -- if ((tsk->flags & PF_VCPU) && !irq_count()) { -+ if (tsk->flags & PF_VCPU) { - acct->gtime += stime; - #ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME - acct->utime_scaled += stime_scaled; - #endif - } else { -- if (hardirq_count()) -- acct->hardirq_time += stime; -- else if (in_serving_softirq()) -- acct->softirq_time += stime; -- else -- acct->stime += stime; -- -+ acct->stime += stime; - #ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME - acct->stime_scaled += stime_scaled; - #endif -@@ -367,10 +367,34 @@ void vtime_account_idle(struct task_struct *tsk) - unsigned long stime, stime_scaled, steal_time; - struct cpu_accounting_data *acct = get_accounting(tsk); - -- stime = vtime_delta(tsk, &stime_scaled, &steal_time); -+ stime = vtime_delta(acct, &stime_scaled, &steal_time); - acct->idle_time += stime + steal_time; - } - -+static void vtime_account_irq_field(struct cpu_accounting_data *acct, -+ unsigned long *field) -+{ 
-+ unsigned long stime, stime_scaled; -+ -+ vtime_delta_kernel(acct, &stime, &stime_scaled); -+ *field += stime; -+#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME -+ acct->stime_scaled += stime_scaled; -+#endif -+} -+ -+void vtime_account_softirq(struct task_struct *tsk) -+{ -+ struct cpu_accounting_data *acct = get_accounting(tsk); -+ vtime_account_irq_field(acct, &acct->softirq_time); -+} -+ -+void vtime_account_hardirq(struct task_struct *tsk) -+{ -+ struct cpu_accounting_data *acct = get_accounting(tsk); -+ vtime_account_irq_field(acct, &acct->hardirq_time); -+} -+ - static void vtime_flush_scaled(struct task_struct *tsk, - struct cpu_accounting_data *acct) - { -diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c -index 18a97631af43..9b3c5978b668 100644 ---- a/arch/s390/kernel/vtime.c -+++ b/arch/s390/kernel/vtime.c -@@ -223,31 +223,50 @@ void vtime_flush(struct task_struct *tsk) - S390_lowcore.avg_steal_timer = avg_steal; - } - -+static u64 vtime_delta(void) -+{ -+ u64 timer = S390_lowcore.last_update_timer; -+ -+ S390_lowcore.last_update_timer = get_vtimer(); -+ -+ return timer - S390_lowcore.last_update_timer; -+} -+ - /* - * Update process times based on virtual cpu times stored by entry.S - * to the lowcore fields user_timer, system_timer & steal_clock. 
- */ - void vtime_account_kernel(struct task_struct *tsk) - { -- u64 timer; -- -- timer = S390_lowcore.last_update_timer; -- S390_lowcore.last_update_timer = get_vtimer(); -- timer -= S390_lowcore.last_update_timer; -+ u64 delta = vtime_delta(); - -- if ((tsk->flags & PF_VCPU) && (irq_count() == 0)) -- S390_lowcore.guest_timer += timer; -- else if (hardirq_count()) -- S390_lowcore.hardirq_timer += timer; -- else if (in_serving_softirq()) -- S390_lowcore.softirq_timer += timer; -+ if (tsk->flags & PF_VCPU) -+ S390_lowcore.guest_timer += delta; - else -- S390_lowcore.system_timer += timer; -+ S390_lowcore.system_timer += delta; - -- virt_timer_forward(timer); -+ virt_timer_forward(delta); - } - EXPORT_SYMBOL_GPL(vtime_account_kernel); - -+void vtime_account_softirq(struct task_struct *tsk) -+{ -+ u64 delta = vtime_delta(); -+ -+ S390_lowcore.softirq_timer += delta; -+ -+ virt_timer_forward(delta); -+} -+ -+void vtime_account_hardirq(struct task_struct *tsk) -+{ -+ u64 delta = vtime_delta(); -+ -+ S390_lowcore.hardirq_timer += delta; -+ -+ virt_timer_forward(delta); -+} -+ - /* - * Sorted add to a list. List is linear searched until first bigger - * element is found. 
-diff --git a/include/linux/vtime.h b/include/linux/vtime.h -index 2cdeca062db3..6c9867419615 100644 ---- a/include/linux/vtime.h -+++ b/include/linux/vtime.h -@@ -83,16 +83,12 @@ static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { } - #endif - - #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE --extern void vtime_account_irq_enter(struct task_struct *tsk); --static inline void vtime_account_irq_exit(struct task_struct *tsk) --{ -- /* On hard|softirq exit we always account to hard|softirq cputime */ -- vtime_account_kernel(tsk); --} -+extern void vtime_account_irq(struct task_struct *tsk); -+extern void vtime_account_softirq(struct task_struct *tsk); -+extern void vtime_account_hardirq(struct task_struct *tsk); - extern void vtime_flush(struct task_struct *tsk); - #else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */ --static inline void vtime_account_irq_enter(struct task_struct *tsk) { } --static inline void vtime_account_irq_exit(struct task_struct *tsk) { } -+static inline void vtime_account_irq(struct task_struct *tsk) { } - static inline void vtime_flush(struct task_struct *tsk) { } - #endif - -@@ -105,13 +101,13 @@ static inline void irqtime_account_irq(struct task_struct *tsk) { } - - static inline void account_irq_enter_time(struct task_struct *tsk) - { -- vtime_account_irq_enter(tsk); -+ vtime_account_irq(tsk); - irqtime_account_irq(tsk); - } - - static inline void account_irq_exit_time(struct task_struct *tsk) - { -- vtime_account_irq_exit(tsk); -+ vtime_account_irq(tsk); - irqtime_account_irq(tsk); - } - -diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c -index 2783162542b1..02163d4260d7 100644 ---- a/kernel/sched/cputime.c -+++ b/kernel/sched/cputime.c -@@ -417,13 +417,18 @@ void vtime_task_switch(struct task_struct *prev) - } - # endif - --void vtime_account_irq_enter(struct task_struct *tsk) -+void vtime_account_irq(struct task_struct *tsk) - { -- if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) && -- !in_interrupt() && 
is_idle_task(tsk)) -+ if (hardirq_count()) { -+ vtime_account_hardirq(tsk); -+ } else if (in_serving_softirq()) { -+ vtime_account_softirq(tsk); -+ } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) && -+ is_idle_task(tsk)) { - vtime_account_idle(tsk); -- else -+ } else { - vtime_account_kernel(tsk); -+ } - } - - void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev, --- -2.30.2 - diff --git a/debian/patches-rt/0131-irqtime-Move-irqtime-entry-accounting-after-irq-offs.patch b/debian/patches-rt/0131-irqtime-Move-irqtime-entry-accounting-after-irq-offs.patch deleted file mode 100644 index c1c053119..000000000 --- a/debian/patches-rt/0131-irqtime-Move-irqtime-entry-accounting-after-irq-offs.patch +++ /dev/null @@ -1,213 +0,0 @@ -From fc6569de970c50344e99ed1175ff563278bf65e1 Mon Sep 17 00:00:00 2001 -From: Frederic Weisbecker <frederic@kernel.org> -Date: Wed, 2 Dec 2020 12:57:31 +0100 -Subject: [PATCH 131/296] irqtime: Move irqtime entry accounting after irq - offset incrementation -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -IRQ time entry is currently accounted before HARDIRQ_OFFSET or -SOFTIRQ_OFFSET are incremented. This is convenient to decide to which -index the cputime to account is dispatched. - -Unfortunately it prevents tick_irq_enter() from being called under -HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ -entry accounting due to the necessary clock catch up. As a result we -don't benefit from appropriate lockdep coverage on tick_irq_enter(). - -To prepare for fixing this, move the IRQ entry cputime accounting after -the preempt offset is incremented. This requires the cputime dispatch -code to handle the extra offset. 
- -Signed-off-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> -Link: https://lore.kernel.org/r/20201202115732.27827-5-frederic@kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/hardirq.h | 4 ++-- - include/linux/vtime.h | 34 ++++++++++++++++++++++++---------- - kernel/sched/cputime.c | 18 +++++++++++------- - kernel/softirq.c | 6 +++--- - 4 files changed, 40 insertions(+), 22 deletions(-) - -diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h -index c35b71f8644a..0926e9ca4d85 100644 ---- a/include/linux/hardirq.h -+++ b/include/linux/hardirq.h -@@ -32,9 +32,9 @@ static __always_inline void rcu_irq_enter_check_tick(void) - */ - #define __irq_enter() \ - do { \ -- account_irq_enter_time(current); \ - preempt_count_add(HARDIRQ_OFFSET); \ - lockdep_hardirq_enter(); \ -+ account_hardirq_enter(current); \ - } while (0) - - /* -@@ -62,8 +62,8 @@ void irq_enter_rcu(void); - */ - #define __irq_exit() \ - do { \ -+ account_hardirq_exit(current); \ - lockdep_hardirq_exit(); \ -- account_irq_exit_time(current); \ - preempt_count_sub(HARDIRQ_OFFSET); \ - } while (0) - -diff --git a/include/linux/vtime.h b/include/linux/vtime.h -index 6c9867419615..041d6524d144 100644 ---- a/include/linux/vtime.h -+++ b/include/linux/vtime.h -@@ -83,32 +83,46 @@ static inline void vtime_init_idle(struct task_struct *tsk, int cpu) { } - #endif - - #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE --extern void vtime_account_irq(struct task_struct *tsk); -+extern void vtime_account_irq(struct task_struct *tsk, unsigned int offset); - extern void vtime_account_softirq(struct task_struct *tsk); - extern void vtime_account_hardirq(struct task_struct *tsk); - extern void vtime_flush(struct task_struct *tsk); - #else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */ --static inline void vtime_account_irq(struct task_struct *tsk) { } -+static inline void 
vtime_account_irq(struct task_struct *tsk, unsigned int offset) { } -+static inline void vtime_account_softirq(struct task_struct *tsk) { } -+static inline void vtime_account_hardirq(struct task_struct *tsk) { } - static inline void vtime_flush(struct task_struct *tsk) { } - #endif - - - #ifdef CONFIG_IRQ_TIME_ACCOUNTING --extern void irqtime_account_irq(struct task_struct *tsk); -+extern void irqtime_account_irq(struct task_struct *tsk, unsigned int offset); - #else --static inline void irqtime_account_irq(struct task_struct *tsk) { } -+static inline void irqtime_account_irq(struct task_struct *tsk, unsigned int offset) { } - #endif - --static inline void account_irq_enter_time(struct task_struct *tsk) -+static inline void account_softirq_enter(struct task_struct *tsk) - { -- vtime_account_irq(tsk); -- irqtime_account_irq(tsk); -+ vtime_account_irq(tsk, SOFTIRQ_OFFSET); -+ irqtime_account_irq(tsk, SOFTIRQ_OFFSET); - } - --static inline void account_irq_exit_time(struct task_struct *tsk) -+static inline void account_softirq_exit(struct task_struct *tsk) - { -- vtime_account_irq(tsk); -- irqtime_account_irq(tsk); -+ vtime_account_softirq(tsk); -+ irqtime_account_irq(tsk, 0); -+} -+ -+static inline void account_hardirq_enter(struct task_struct *tsk) -+{ -+ vtime_account_irq(tsk, HARDIRQ_OFFSET); -+ irqtime_account_irq(tsk, HARDIRQ_OFFSET); -+} -+ -+static inline void account_hardirq_exit(struct task_struct *tsk) -+{ -+ vtime_account_hardirq(tsk); -+ irqtime_account_irq(tsk, 0); - } - - #endif /* _LINUX_KERNEL_VTIME_H */ -diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c -index 02163d4260d7..5f611658eeab 100644 ---- a/kernel/sched/cputime.c -+++ b/kernel/sched/cputime.c -@@ -44,12 +44,13 @@ static void irqtime_account_delta(struct irqtime *irqtime, u64 delta, - } - - /* -- * Called before incrementing preempt_count on {soft,}irq_enter -+ * Called after incrementing preempt_count on {soft,}irq_enter - * and before decrementing preempt_count on 
{soft,}irq_exit. - */ --void irqtime_account_irq(struct task_struct *curr) -+void irqtime_account_irq(struct task_struct *curr, unsigned int offset) - { - struct irqtime *irqtime = this_cpu_ptr(&cpu_irqtime); -+ unsigned int pc; - s64 delta; - int cpu; - -@@ -59,6 +60,7 @@ void irqtime_account_irq(struct task_struct *curr) - cpu = smp_processor_id(); - delta = sched_clock_cpu(cpu) - irqtime->irq_start_time; - irqtime->irq_start_time += delta; -+ pc = preempt_count() - offset; - - /* - * We do not account for softirq time from ksoftirqd here. -@@ -66,9 +68,9 @@ void irqtime_account_irq(struct task_struct *curr) - * in that case, so as not to confuse scheduler with a special task - * that do not consume any time, but still wants to run. - */ -- if (hardirq_count()) -+ if (pc & HARDIRQ_MASK) - irqtime_account_delta(irqtime, delta, CPUTIME_IRQ); -- else if (in_serving_softirq() && curr != this_cpu_ksoftirqd()) -+ else if ((pc & SOFTIRQ_OFFSET) && curr != this_cpu_ksoftirqd()) - irqtime_account_delta(irqtime, delta, CPUTIME_SOFTIRQ); - } - -@@ -417,11 +419,13 @@ void vtime_task_switch(struct task_struct *prev) - } - # endif - --void vtime_account_irq(struct task_struct *tsk) -+void vtime_account_irq(struct task_struct *tsk, unsigned int offset) - { -- if (hardirq_count()) { -+ unsigned int pc = preempt_count() - offset; -+ -+ if (pc & HARDIRQ_OFFSET) { - vtime_account_hardirq(tsk); -- } else if (in_serving_softirq()) { -+ } else if (pc & SOFTIRQ_OFFSET) { - vtime_account_softirq(tsk); - } else if (!IS_ENABLED(CONFIG_HAVE_VIRT_CPU_ACCOUNTING_IDLE) && - is_idle_task(tsk)) { -diff --git a/kernel/softirq.c b/kernel/softirq.c -index 617009ccd82c..b8f42b3ba8ca 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -315,10 +315,10 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) - current->flags &= ~PF_MEMALLOC; - - pending = local_softirq_pending(); -- account_irq_enter_time(current); - - __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET); - in_hardirq = 
lockdep_softirq_start(); -+ account_softirq_enter(current); - - restart: - /* Reset the pending bitmask before enabling irqs */ -@@ -365,8 +365,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) - wakeup_softirqd(); - } - -+ account_softirq_exit(current); - lockdep_softirq_end(in_hardirq); -- account_irq_exit_time(current); - __local_bh_enable(SOFTIRQ_OFFSET); - WARN_ON_ONCE(in_interrupt()); - current_restore_flags(old_flags, PF_MEMALLOC); -@@ -418,7 +418,7 @@ static inline void __irq_exit_rcu(void) - #else - lockdep_assert_irqs_disabled(); - #endif -- account_irq_exit_time(current); -+ account_hardirq_exit(current); - preempt_count_sub(HARDIRQ_OFFSET); - if (!in_interrupt() && local_softirq_pending()) - invoke_softirq(); --- -2.30.2 - diff --git a/debian/patches-rt/0132-irq-Call-tick_irq_enter-inside-HARDIRQ_OFFSET.patch b/debian/patches-rt/0132-irq-Call-tick_irq_enter-inside-HARDIRQ_OFFSET.patch deleted file mode 100644 index d99a0f047..000000000 --- a/debian/patches-rt/0132-irq-Call-tick_irq_enter-inside-HARDIRQ_OFFSET.patch +++ /dev/null @@ -1,51 +0,0 @@ -From 8bb495fa16cad998d270762965890c2688d934c2 Mon Sep 17 00:00:00 2001 -From: Frederic Weisbecker <frederic@kernel.org> -Date: Wed, 2 Dec 2020 12:57:32 +0100 -Subject: [PATCH 132/296] irq: Call tick_irq_enter() inside HARDIRQ_OFFSET -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Now that account_hardirq_enter() is called after HARDIRQ_OFFSET has -been incremented, there is nothing left that prevents us from also -moving tick_irq_enter() after HARDIRQ_OFFSET is incremented. - -The desired outcome is to remove the nasty hack that prevents softirqs -from being raised through ksoftirqd instead of the hardirq bottom half. -Also tick_irq_enter() then becomes appropriately covered by lockdep. 
- -Signed-off-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://lore.kernel.org/r/20201202115732.27827-6-frederic@kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/softirq.c | 14 +++++--------- - 1 file changed, 5 insertions(+), 9 deletions(-) - -diff --git a/kernel/softirq.c b/kernel/softirq.c -index b8f42b3ba8ca..d5bfd5e661fc 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -377,16 +377,12 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) - */ - void irq_enter_rcu(void) - { -- if (is_idle_task(current) && !in_interrupt()) { -- /* -- * Prevent raise_softirq from needlessly waking up ksoftirqd -- * here, as softirq will be serviced on return from interrupt. -- */ -- local_bh_disable(); -+ __irq_enter_raw(); -+ -+ if (is_idle_task(current) && (irq_count() == HARDIRQ_OFFSET)) - tick_irq_enter(); -- _local_bh_enable(); -- } -- __irq_enter(); -+ -+ account_hardirq_enter(current); - } - - /** --- -2.30.2 - diff --git a/debian/patches-rt/0134-net-arcnet-Fix-RESET-flag-handling.patch b/debian/patches-rt/0134-net-arcnet-Fix-RESET-flag-handling.patch deleted file mode 100644 index 40b434c29..000000000 --- a/debian/patches-rt/0134-net-arcnet-Fix-RESET-flag-handling.patch +++ /dev/null @@ -1,311 +0,0 @@ -From c2f66dc3c8f543d063d97b003766809857f320c2 Mon Sep 17 00:00:00 2001 -From: "Ahmed S. Darwish" <a.darwish@linutronix.de> -Date: Thu, 28 Jan 2021 20:48:02 +0100 -Subject: [PATCH 134/296] net: arcnet: Fix RESET flag handling -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The main arcnet interrupt handler calls arcnet_close() then -arcnet_open(), if the RESET status flag is encountered. - -This is invalid: - - 1) In general, interrupt handlers should never call ->ndo_stop() and - ->ndo_open() functions. 
They are usually full of blocking calls and - other methods that are expected to be called only from drivers - init and exit code paths. - - 2) arcnet_close() contains a del_timer_sync(). If the irq handler - interrupts the to-be-deleted timer, del_timer_sync() will just loop - forever. - - 3) arcnet_close() also calls tasklet_kill(), which has a warning if - called from irq context. - - 4) For device reset, the sequence "arcnet_close(); arcnet_open();" is - not complete. Some children arcnet drivers have special init/exit - code sequences, which then embed a call to arcnet_open() and - arcnet_close() accordingly. Check drivers/net/arcnet/com20020.c. - -Run the device RESET sequence from a scheduled workqueue instead. - -Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Link: https://lore.kernel.org/r/20210128194802.727770-1-a.darwish@linutronix.de -Signed-off-by: Jakub Kicinski <kuba@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/net/arcnet/arc-rimi.c | 4 +- - drivers/net/arcnet/arcdevice.h | 6 +++ - drivers/net/arcnet/arcnet.c | 66 +++++++++++++++++++++++++++++-- - drivers/net/arcnet/com20020-isa.c | 4 +- - drivers/net/arcnet/com20020-pci.c | 2 +- - drivers/net/arcnet/com20020_cs.c | 2 +- - drivers/net/arcnet/com90io.c | 4 +- - drivers/net/arcnet/com90xx.c | 4 +- - 8 files changed, 78 insertions(+), 14 deletions(-) - -diff --git a/drivers/net/arcnet/arc-rimi.c b/drivers/net/arcnet/arc-rimi.c -index 98df38fe553c..12d085405bd0 100644 ---- a/drivers/net/arcnet/arc-rimi.c -+++ b/drivers/net/arcnet/arc-rimi.c -@@ -332,7 +332,7 @@ static int __init arc_rimi_init(void) - dev->irq = 9; - - if (arcrimi_probe(dev)) { -- free_netdev(dev); -+ free_arcdev(dev); - return -EIO; - } - -@@ -349,7 +349,7 @@ static void __exit arc_rimi_exit(void) - iounmap(lp->mem_start); - release_mem_region(dev->mem_start, dev->mem_end - dev->mem_start + 1); - 
free_irq(dev->irq, dev); -- free_netdev(dev); -+ free_arcdev(dev); - } - - #ifndef MODULE -diff --git a/drivers/net/arcnet/arcdevice.h b/drivers/net/arcnet/arcdevice.h -index 22a49c6d7ae6..5d4a4c7efbbf 100644 ---- a/drivers/net/arcnet/arcdevice.h -+++ b/drivers/net/arcnet/arcdevice.h -@@ -298,6 +298,10 @@ struct arcnet_local { - - int excnak_pending; /* We just got an excesive nak interrupt */ - -+ /* RESET flag handling */ -+ int reset_in_progress; -+ struct work_struct reset_work; -+ - struct { - uint16_t sequence; /* sequence number (incs with each packet) */ - __be16 aborted_seq; -@@ -350,7 +354,9 @@ void arcnet_dump_skb(struct net_device *dev, struct sk_buff *skb, char *desc) - - void arcnet_unregister_proto(struct ArcProto *proto); - irqreturn_t arcnet_interrupt(int irq, void *dev_id); -+ - struct net_device *alloc_arcdev(const char *name); -+void free_arcdev(struct net_device *dev); - - int arcnet_open(struct net_device *dev); - int arcnet_close(struct net_device *dev); -diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c -index e04efc0a5c97..d76dd7d14299 100644 ---- a/drivers/net/arcnet/arcnet.c -+++ b/drivers/net/arcnet/arcnet.c -@@ -387,10 +387,44 @@ static void arcnet_timer(struct timer_list *t) - struct arcnet_local *lp = from_timer(lp, t, timer); - struct net_device *dev = lp->dev; - -- if (!netif_carrier_ok(dev)) { -+ spin_lock_irq(&lp->lock); -+ -+ if (!lp->reset_in_progress && !netif_carrier_ok(dev)) { - netif_carrier_on(dev); - netdev_info(dev, "link up\n"); - } -+ -+ spin_unlock_irq(&lp->lock); -+} -+ -+static void reset_device_work(struct work_struct *work) -+{ -+ struct arcnet_local *lp; -+ struct net_device *dev; -+ -+ lp = container_of(work, struct arcnet_local, reset_work); -+ dev = lp->dev; -+ -+ /* Do not bring the network interface back up if an ifdown -+ * was already done. 
-+ */ -+ if (!netif_running(dev) || !lp->reset_in_progress) -+ return; -+ -+ rtnl_lock(); -+ -+ /* Do another check, in case of an ifdown that was triggered in -+ * the small race window between the exit condition above and -+ * acquiring RTNL. -+ */ -+ if (!netif_running(dev) || !lp->reset_in_progress) -+ goto out; -+ -+ dev_close(dev); -+ dev_open(dev, NULL); -+ -+out: -+ rtnl_unlock(); - } - - static void arcnet_reply_tasklet(unsigned long data) -@@ -452,12 +486,25 @@ struct net_device *alloc_arcdev(const char *name) - lp->dev = dev; - spin_lock_init(&lp->lock); - timer_setup(&lp->timer, arcnet_timer, 0); -+ INIT_WORK(&lp->reset_work, reset_device_work); - } - - return dev; - } - EXPORT_SYMBOL(alloc_arcdev); - -+void free_arcdev(struct net_device *dev) -+{ -+ struct arcnet_local *lp = netdev_priv(dev); -+ -+ /* Do not cancel this at ->ndo_close(), as the workqueue itself -+ * indirectly calls the ifdown path through dev_close(). -+ */ -+ cancel_work_sync(&lp->reset_work); -+ free_netdev(dev); -+} -+EXPORT_SYMBOL(free_arcdev); -+ - /* Open/initialize the board. This is called sometime after booting when - * the 'ifconfig' program is run. - * -@@ -587,6 +634,10 @@ int arcnet_close(struct net_device *dev) - - /* shut down the card */ - lp->hw.close(dev); -+ -+ /* reset counters */ -+ lp->reset_in_progress = 0; -+ - module_put(lp->hw.owner); - return 0; - } -@@ -820,6 +871,9 @@ irqreturn_t arcnet_interrupt(int irq, void *dev_id) - - spin_lock_irqsave(&lp->lock, flags); - -+ if (lp->reset_in_progress) -+ goto out; -+ - /* RESET flag was enabled - if device is not running, we must - * clear it right away (but nothing else). 
- */ -@@ -852,11 +906,14 @@ irqreturn_t arcnet_interrupt(int irq, void *dev_id) - if (status & RESETflag) { - arc_printk(D_NORMAL, dev, "spurious reset (status=%Xh)\n", - status); -- arcnet_close(dev); -- arcnet_open(dev); -+ -+ lp->reset_in_progress = 1; -+ netif_stop_queue(dev); -+ netif_carrier_off(dev); -+ schedule_work(&lp->reset_work); - - /* get out of the interrupt handler! */ -- break; -+ goto out; - } - /* RX is inhibited - we must have received something. - * Prepare to receive into the next buffer. -@@ -1052,6 +1109,7 @@ irqreturn_t arcnet_interrupt(int irq, void *dev_id) - udelay(1); - lp->hw.intmask(dev, lp->intmask); - -+out: - spin_unlock_irqrestore(&lp->lock, flags); - return retval; - } -diff --git a/drivers/net/arcnet/com20020-isa.c b/drivers/net/arcnet/com20020-isa.c -index f983c4ce6b07..be618e4b9ed5 100644 ---- a/drivers/net/arcnet/com20020-isa.c -+++ b/drivers/net/arcnet/com20020-isa.c -@@ -169,7 +169,7 @@ static int __init com20020_init(void) - dev->irq = 9; - - if (com20020isa_probe(dev)) { -- free_netdev(dev); -+ free_arcdev(dev); - return -EIO; - } - -@@ -182,7 +182,7 @@ static void __exit com20020_exit(void) - unregister_netdev(my_dev); - free_irq(my_dev->irq, my_dev); - release_region(my_dev->base_addr, ARCNET_TOTAL_SIZE); -- free_netdev(my_dev); -+ free_arcdev(my_dev); - } - - #ifndef MODULE -diff --git a/drivers/net/arcnet/com20020-pci.c b/drivers/net/arcnet/com20020-pci.c -index eb7f76753c9c..8bdc44b7e09a 100644 ---- a/drivers/net/arcnet/com20020-pci.c -+++ b/drivers/net/arcnet/com20020-pci.c -@@ -291,7 +291,7 @@ static void com20020pci_remove(struct pci_dev *pdev) - - unregister_netdev(dev); - free_irq(dev->irq, dev); -- free_netdev(dev); -+ free_arcdev(dev); - } - } - -diff --git a/drivers/net/arcnet/com20020_cs.c b/drivers/net/arcnet/com20020_cs.c -index cf607ffcf358..9cc5eb6a8e90 100644 ---- a/drivers/net/arcnet/com20020_cs.c -+++ b/drivers/net/arcnet/com20020_cs.c -@@ -177,7 +177,7 @@ static void com20020_detach(struct 
pcmcia_device *link) - dev = info->dev; - if (dev) { - dev_dbg(&link->dev, "kfree...\n"); -- free_netdev(dev); -+ free_arcdev(dev); - } - dev_dbg(&link->dev, "kfree2...\n"); - kfree(info); -diff --git a/drivers/net/arcnet/com90io.c b/drivers/net/arcnet/com90io.c -index cf214b730671..3856b447d38e 100644 ---- a/drivers/net/arcnet/com90io.c -+++ b/drivers/net/arcnet/com90io.c -@@ -396,7 +396,7 @@ static int __init com90io_init(void) - err = com90io_probe(dev); - - if (err) { -- free_netdev(dev); -+ free_arcdev(dev); - return err; - } - -@@ -419,7 +419,7 @@ static void __exit com90io_exit(void) - - free_irq(dev->irq, dev); - release_region(dev->base_addr, ARCNET_TOTAL_SIZE); -- free_netdev(dev); -+ free_arcdev(dev); - } - - module_init(com90io_init) -diff --git a/drivers/net/arcnet/com90xx.c b/drivers/net/arcnet/com90xx.c -index 3dc3d533cb19..d8dfb9ea0de8 100644 ---- a/drivers/net/arcnet/com90xx.c -+++ b/drivers/net/arcnet/com90xx.c -@@ -554,7 +554,7 @@ static int __init com90xx_found(int ioaddr, int airq, u_long shmem, - err_release_mem: - release_mem_region(dev->mem_start, dev->mem_end - dev->mem_start + 1); - err_free_dev: -- free_netdev(dev); -+ free_arcdev(dev); - return -EIO; - } - -@@ -672,7 +672,7 @@ static void __exit com90xx_exit(void) - release_region(dev->base_addr, ARCNET_TOTAL_SIZE); - release_mem_region(dev->mem_start, - dev->mem_end - dev->mem_start + 1); -- free_netdev(dev); -+ free_arcdev(dev); - } - } - --- -2.30.2 - diff --git a/debian/patches-rt/0135-tasklets-Replace-barrier-with-cpu_relax-in-tasklet_u.patch b/debian/patches-rt/0135-tasklets-Replace-barrier-with-cpu_relax-in-tasklet_u.patch deleted file mode 100644 index 3bafc0951..000000000 --- a/debian/patches-rt/0135-tasklets-Replace-barrier-with-cpu_relax-in-tasklet_u.patch +++ /dev/null @@ -1,35 +0,0 @@ -From 77f6ca2a81466fe36e8549371d59e935b51bb828 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:04 +0100 -Subject: [PATCH 135/296] tasklets: 
Replace barrier() with cpu_relax() in - tasklet_unlock_wait() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -A barrier() in a tight loop which waits for something to happen on a remote -CPU is a pointless exercise. Replace it with cpu_relax() which allows HT -siblings to make progress. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/interrupt.h | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h -index ee8299eb1f52..10248a726373 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -668,7 +668,8 @@ static inline void tasklet_unlock(struct tasklet_struct *t) - - static inline void tasklet_unlock_wait(struct tasklet_struct *t) - { -- while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { barrier(); } -+ while (test_bit(TASKLET_STATE_RUN, &t->state)) -+ cpu_relax(); - } - #else - #define tasklet_trylock(t) 1 --- -2.30.2 - diff --git a/debian/patches-rt/0136-tasklets-Use-static-inlines-for-stub-implementations.patch b/debian/patches-rt/0136-tasklets-Use-static-inlines-for-stub-implementations.patch deleted file mode 100644 index be0e5a9d7..000000000 --- a/debian/patches-rt/0136-tasklets-Use-static-inlines-for-stub-implementations.patch +++ /dev/null @@ -1,35 +0,0 @@ -From 0d4ead281ddc5c1f9ddbf2431eaacfb4f52bf587 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:05 +0100 -Subject: [PATCH 136/296] tasklets: Use static inlines for stub implementations -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Inlines exist for a reason. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/interrupt.h | 6 +++--- - 1 file changed, 3 insertions(+), 3 deletions(-) - -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h -index 10248a726373..0c319b7e929a 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -672,9 +672,9 @@ static inline void tasklet_unlock_wait(struct tasklet_struct *t) - cpu_relax(); - } - #else --#define tasklet_trylock(t) 1 --#define tasklet_unlock_wait(t) do { } while (0) --#define tasklet_unlock(t) do { } while (0) -+static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; } -+static inline void tasklet_unlock(struct tasklet_struct *t) { } -+static inline void tasklet_unlock_wait(struct tasklet_struct *t) { } - #endif - - extern void __tasklet_schedule(struct tasklet_struct *t); --- -2.30.2 - diff --git a/debian/patches-rt/0137-tasklets-Provide-tasklet_disable_in_atomic.patch b/debian/patches-rt/0137-tasklets-Provide-tasklet_disable_in_atomic.patch deleted file mode 100644 index 90cf5ca32..000000000 --- a/debian/patches-rt/0137-tasklets-Provide-tasklet_disable_in_atomic.patch +++ /dev/null @@ -1,68 +0,0 @@ -From 3ec93684bd81a0ca6e6761322582bdbaa6530845 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:06 +0100 -Subject: [PATCH 137/296] tasklets: Provide tasklet_disable_in_atomic() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Replacing the spin wait loops in tasklet_unlock_wait() with -wait_var_event() is not possible as a handful of tasklet_disable() -invocations are happening in atomic context. All other invocations are in -teardown paths which can sleep. 
- -Provide tasklet_disable_in_atomic() and tasklet_unlock_spin_wait() to -convert the few atomic use cases over, which allows to change -tasklet_disable() and tasklet_unlock_wait() in a later step. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/interrupt.h | 22 ++++++++++++++++++++++ - 1 file changed, 22 insertions(+) - -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h -index 0c319b7e929a..9a9dd3ed6fdc 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -671,10 +671,21 @@ static inline void tasklet_unlock_wait(struct tasklet_struct *t) - while (test_bit(TASKLET_STATE_RUN, &t->state)) - cpu_relax(); - } -+ -+/* -+ * Do not use in new code. Waiting for tasklets from atomic contexts is -+ * error prone and should be avoided. -+ */ -+static inline void tasklet_unlock_spin_wait(struct tasklet_struct *t) -+{ -+ while (test_bit(TASKLET_STATE_RUN, &t->state)) -+ cpu_relax(); -+} - #else - static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; } - static inline void tasklet_unlock(struct tasklet_struct *t) { } - static inline void tasklet_unlock_wait(struct tasklet_struct *t) { } -+static inline void tasklet_unlock_spin_wait(struct tasklet_struct *t) { } - #endif - - extern void __tasklet_schedule(struct tasklet_struct *t); -@@ -699,6 +710,17 @@ static inline void tasklet_disable_nosync(struct tasklet_struct *t) - smp_mb__after_atomic(); - } - -+/* -+ * Do not use in new code. Disabling tasklets from atomic contexts is -+ * error prone and should be avoided. 
-+ */ -+static inline void tasklet_disable_in_atomic(struct tasklet_struct *t) -+{ -+ tasklet_disable_nosync(t); -+ tasklet_unlock_spin_wait(t); -+ smp_mb(); -+} -+ - static inline void tasklet_disable(struct tasklet_struct *t) - { - tasklet_disable_nosync(t); --- -2.30.2 - diff --git a/debian/patches-rt/0138-tasklets-Use-spin-wait-in-tasklet_disable-temporaril.patch b/debian/patches-rt/0138-tasklets-Use-spin-wait-in-tasklet_disable-temporaril.patch deleted file mode 100644 index bd6c5d4e8..000000000 --- a/debian/patches-rt/0138-tasklets-Use-spin-wait-in-tasklet_disable-temporaril.patch +++ /dev/null @@ -1,33 +0,0 @@ -From e559073398a9dca28ca98d946ec23433b797ae77 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:07 +0100 -Subject: [PATCH 138/296] tasklets: Use spin wait in tasklet_disable() - temporarily -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -To ease the transition use spin waiting in tasklet_disable() until all -usage sites from atomic context have been cleaned up. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/interrupt.h | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h -index 9a9dd3ed6fdc..77bbb7e4e0f9 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -724,7 +724,8 @@ static inline void tasklet_disable_in_atomic(struct tasklet_struct *t) - static inline void tasklet_disable(struct tasklet_struct *t) - { - tasklet_disable_nosync(t); -- tasklet_unlock_wait(t); -+ /* Spin wait until all atomic users are converted */ -+ tasklet_unlock_spin_wait(t); - smp_mb(); - } - --- -2.30.2 - diff --git a/debian/patches-rt/0139-tasklets-Replace-spin-wait-in-tasklet_unlock_wait.patch b/debian/patches-rt/0139-tasklets-Replace-spin-wait-in-tasklet_unlock_wait.patch deleted file mode 100644 index d06d73608..000000000 --- a/debian/patches-rt/0139-tasklets-Replace-spin-wait-in-tasklet_unlock_wait.patch +++ /dev/null @@ -1,90 +0,0 @@ -From a5330880ad2acaa63e2f4c4ecd27bdae9e6befe8 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Tue, 9 Mar 2021 09:42:08 +0100 -Subject: [PATCH 139/296] tasklets: Replace spin wait in tasklet_unlock_wait() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -tasklet_unlock_wait() spin waits for TASKLET_STATE_RUN to be cleared. This -is wasting CPU cycles in a tight loop which is especially painful in a -guest when the CPU running the tasklet is scheduled out. - -tasklet_unlock_wait() is invoked from tasklet_kill() which is used in -teardown paths and not performance critical at all. Replace the spin wait -with wait_var_event(). - -There are no users of tasklet_unlock_wait() which are invoked from atomic -contexts. 
The usage in tasklet_disable() has been replaced temporarily with -the spin waiting variant until the atomic users are fixed up and will be -converted to the sleep wait variant later. - -Signed-off-by: Peter Zijlstra <peterz@infradead.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/interrupt.h | 13 ++----------- - kernel/softirq.c | 18 ++++++++++++++++++ - 2 files changed, 20 insertions(+), 11 deletions(-) - -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h -index 77bbb7e4e0f9..77cd88cc4575 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -660,17 +660,8 @@ static inline int tasklet_trylock(struct tasklet_struct *t) - return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state); - } - --static inline void tasklet_unlock(struct tasklet_struct *t) --{ -- smp_mb__before_atomic(); -- clear_bit(TASKLET_STATE_RUN, &(t)->state); --} -- --static inline void tasklet_unlock_wait(struct tasklet_struct *t) --{ -- while (test_bit(TASKLET_STATE_RUN, &t->state)) -- cpu_relax(); --} -+void tasklet_unlock(struct tasklet_struct *t); -+void tasklet_unlock_wait(struct tasklet_struct *t); - - /* - * Do not use in new code. 
Waiting for tasklets from atomic contexts is -diff --git a/kernel/softirq.c b/kernel/softirq.c -index d5bfd5e661fc..06bca024ce45 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -25,6 +25,7 @@ - #include <linux/smpboot.h> - #include <linux/tick.h> - #include <linux/irq.h> -+#include <linux/wait_bit.h> - - #define CREATE_TRACE_POINTS - #include <trace/events/irq.h> -@@ -619,6 +620,23 @@ void tasklet_kill(struct tasklet_struct *t) - } - EXPORT_SYMBOL(tasklet_kill); - -+#ifdef CONFIG_SMP -+void tasklet_unlock(struct tasklet_struct *t) -+{ -+ smp_mb__before_atomic(); -+ clear_bit(TASKLET_STATE_RUN, &t->state); -+ smp_mb__after_atomic(); -+ wake_up_var(&t->state); -+} -+EXPORT_SYMBOL_GPL(tasklet_unlock); -+ -+void tasklet_unlock_wait(struct tasklet_struct *t) -+{ -+ wait_var_event(&t->state, !test_bit(TASKLET_STATE_RUN, &t->state)); -+} -+EXPORT_SYMBOL_GPL(tasklet_unlock_wait); -+#endif -+ - void __init softirq_init(void) - { - int cpu; --- -2.30.2 - diff --git a/debian/patches-rt/0140-tasklets-Replace-spin-wait-in-tasklet_kill.patch b/debian/patches-rt/0140-tasklets-Replace-spin-wait-in-tasklet_kill.patch deleted file mode 100644 index af94b46f2..000000000 --- a/debian/patches-rt/0140-tasklets-Replace-spin-wait-in-tasklet_kill.patch +++ /dev/null @@ -1,74 +0,0 @@ -From 47a26233fa5f9be9110a0d0173523f6ce902d417 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Tue, 9 Mar 2021 09:42:09 +0100 -Subject: [PATCH 140/296] tasklets: Replace spin wait in tasklet_kill() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -tasklet_kill() spin waits for TASKLET_STATE_SCHED to be cleared invoking -yield() from inside the loop. yield() is an ill defined mechanism and the -result might still be wasting CPU cycles in a tight loop which is -especially painful in a guest when the CPU running the tasklet is scheduled -out. - -tasklet_kill() is used in teardown paths and not performance critical at -all. 
Replace the spin wait with wait_var_event(). - -Signed-off-by: Peter Zijlstra <peterz@infradead.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/softirq.c | 23 +++++++++++++++-------- - 1 file changed, 15 insertions(+), 8 deletions(-) - -diff --git a/kernel/softirq.c b/kernel/softirq.c -index 06bca024ce45..ecc3ac4091c8 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -530,6 +530,16 @@ void __tasklet_hi_schedule(struct tasklet_struct *t) - } - EXPORT_SYMBOL(__tasklet_hi_schedule); - -+static inline bool tasklet_clear_sched(struct tasklet_struct *t) -+{ -+ if (test_and_clear_bit(TASKLET_STATE_SCHED, &t->state)) { -+ wake_up_var(&t->state); -+ return true; -+ } -+ -+ return false; -+} -+ - static void tasklet_action_common(struct softirq_action *a, - struct tasklet_head *tl_head, - unsigned int softirq_nr) -@@ -549,8 +559,7 @@ static void tasklet_action_common(struct softirq_action *a, - - if (tasklet_trylock(t)) { - if (!atomic_read(&t->count)) { -- if (!test_and_clear_bit(TASKLET_STATE_SCHED, -- &t->state)) -+ if (!tasklet_clear_sched(t)) - BUG(); - if (t->use_callback) - t->callback(t); -@@ -610,13 +619,11 @@ void tasklet_kill(struct tasklet_struct *t) - if (in_interrupt()) - pr_notice("Attempt to kill tasklet from interrupt\n"); - -- while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) { -- do { -- yield(); -- } while (test_bit(TASKLET_STATE_SCHED, &t->state)); -- } -+ while (test_and_set_bit(TASKLET_STATE_SCHED, &t->state)) -+ wait_var_event(&t->state, !test_bit(TASKLET_STATE_SCHED, &t->state)); -+ - tasklet_unlock_wait(t); -- clear_bit(TASKLET_STATE_SCHED, &t->state); -+ tasklet_clear_sched(t); - } - EXPORT_SYMBOL(tasklet_kill); - --- -2.30.2 - diff --git a/debian/patches-rt/0141-tasklets-Prevent-tasklet_unlock_spin_wait-deadlock-o.patch b/debian/patches-rt/0141-tasklets-Prevent-tasklet_unlock_spin_wait-deadlock-o.patch deleted file mode 100644 index 
ffe2da8dc..000000000 --- a/debian/patches-rt/0141-tasklets-Prevent-tasklet_unlock_spin_wait-deadlock-o.patch +++ /dev/null @@ -1,109 +0,0 @@ -From 3309e5b7ed86e4afc749314967e28e8f6b3c5fe1 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:10 +0100 -Subject: [PATCH 141/296] tasklets: Prevent tasklet_unlock_spin_wait() deadlock - on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -tasklet_unlock_spin_wait() spin waits for the TASKLET_STATE_SCHED bit in -the tasklet state to be cleared. This works on !RT nicely because the -corresponding execution can only happen on a different CPU. - -On RT softirq processing is preemptible, therefore a task preempting the -softirq processing thread can spin forever. - -Prevent this by invoking local_bh_disable()/enable() inside the loop. In -case that the softirq processing thread was preempted by the current task, -current will block on the local lock which yields the CPU to the preempted -softirq processing thread. If the tasklet is processed on a different CPU -then the local_bh_disable()/enable() pair is just a waste of processor -cycles. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/interrupt.h | 12 ++---------- - kernel/softirq.c | 28 +++++++++++++++++++++++++++- - 2 files changed, 29 insertions(+), 11 deletions(-) - -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h -index 77cd88cc4575..4b3ec31189db 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -654,7 +654,7 @@ enum - TASKLET_STATE_RUN /* Tasklet is running (SMP only) */ - }; - --#ifdef CONFIG_SMP -+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) - static inline int tasklet_trylock(struct tasklet_struct *t) - { - return !test_and_set_bit(TASKLET_STATE_RUN, &(t)->state); -@@ -662,16 +662,8 @@ static inline int tasklet_trylock(struct tasklet_struct *t) - - void tasklet_unlock(struct tasklet_struct *t); - void tasklet_unlock_wait(struct tasklet_struct *t); -+void tasklet_unlock_spin_wait(struct tasklet_struct *t); - --/* -- * Do not use in new code. Waiting for tasklets from atomic contexts is -- * error prone and should be avoided. -- */ --static inline void tasklet_unlock_spin_wait(struct tasklet_struct *t) --{ -- while (test_bit(TASKLET_STATE_RUN, &t->state)) -- cpu_relax(); --} - #else - static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; } - static inline void tasklet_unlock(struct tasklet_struct *t) { } -diff --git a/kernel/softirq.c b/kernel/softirq.c -index ecc3ac4091c8..fcb201ceed71 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -614,6 +614,32 @@ void tasklet_init(struct tasklet_struct *t, - } - EXPORT_SYMBOL(tasklet_init); - -+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) -+/* -+ * Do not use in new code. Waiting for tasklets from atomic contexts is -+ * error prone and should be avoided. 
-+ */ -+void tasklet_unlock_spin_wait(struct tasklet_struct *t) -+{ -+ while (test_bit(TASKLET_STATE_RUN, &(t)->state)) { -+ if (IS_ENABLED(CONFIG_PREEMPT_RT)) { -+ /* -+ * Prevent a live lock when current preempted soft -+ * interrupt processing or prevents ksoftirqd from -+ * running. If the tasklet runs on a different CPU -+ * then this has no effect other than doing the BH -+ * disable/enable dance for nothing. -+ */ -+ local_bh_disable(); -+ local_bh_enable(); -+ } else { -+ cpu_relax(); -+ } -+ } -+} -+EXPORT_SYMBOL(tasklet_unlock_spin_wait); -+#endif -+ - void tasklet_kill(struct tasklet_struct *t) - { - if (in_interrupt()) -@@ -627,7 +653,7 @@ void tasklet_kill(struct tasklet_struct *t) - } - EXPORT_SYMBOL(tasklet_kill); - --#ifdef CONFIG_SMP -+#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT) - void tasklet_unlock(struct tasklet_struct *t) - { - smp_mb__before_atomic(); --- -2.30.2 - diff --git a/debian/patches-rt/0142-net-jme-Replace-link-change-tasklet-with-work.patch b/debian/patches-rt/0142-net-jme-Replace-link-change-tasklet-with-work.patch deleted file mode 100644 index f40531bf9..000000000 --- a/debian/patches-rt/0142-net-jme-Replace-link-change-tasklet-with-work.patch +++ /dev/null @@ -1,88 +0,0 @@ -From f7eb13c31d8d3e6a849b2aea91bee8b0ecd0b94e Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:11 +0100 -Subject: [PATCH 142/296] net: jme: Replace link-change tasklet with work -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The link change tasklet disables the tasklets for tx/rx processing while -updating hw parameters and then enables the tasklets again. - -This update can also be pushed into a workqueue where it can be performed -in preemptible context. This allows tasklet_disable() to become sleeping. - -Replace the linkch_task tasklet with a work.
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/net/ethernet/jme.c | 10 +++++----- - drivers/net/ethernet/jme.h | 2 +- - 2 files changed, 6 insertions(+), 6 deletions(-) - -diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c -index e9efe074edc1..f1b9284e0bea 100644 ---- a/drivers/net/ethernet/jme.c -+++ b/drivers/net/ethernet/jme.c -@@ -1265,9 +1265,9 @@ jme_stop_shutdown_timer(struct jme_adapter *jme) - jwrite32f(jme, JME_APMC, apmc); - } - --static void jme_link_change_tasklet(struct tasklet_struct *t) -+static void jme_link_change_work(struct work_struct *work) - { -- struct jme_adapter *jme = from_tasklet(jme, t, linkch_task); -+ struct jme_adapter *jme = container_of(work, struct jme_adapter, linkch_task); - struct net_device *netdev = jme->dev; - int rc; - -@@ -1510,7 +1510,7 @@ jme_intr_msi(struct jme_adapter *jme, u32 intrstat) - * all other events are ignored - */ - jwrite32(jme, JME_IEVE, intrstat); -- tasklet_schedule(&jme->linkch_task); -+ schedule_work(&jme->linkch_task); - goto out_reenable; - } - -@@ -1832,7 +1832,6 @@ jme_open(struct net_device *netdev) - jme_clear_pm_disable_wol(jme); - JME_NAPI_ENABLE(jme); - -- tasklet_setup(&jme->linkch_task, jme_link_change_tasklet); - tasklet_setup(&jme->txclean_task, jme_tx_clean_tasklet); - tasklet_setup(&jme->rxclean_task, jme_rx_clean_tasklet); - tasklet_setup(&jme->rxempty_task, jme_rx_empty_tasklet); -@@ -1920,7 +1919,7 @@ jme_close(struct net_device *netdev) - - JME_NAPI_DISABLE(jme); - -- tasklet_kill(&jme->linkch_task); -+ cancel_work_sync(&jme->linkch_task); - tasklet_kill(&jme->txclean_task); - tasklet_kill(&jme->rxclean_task); - tasklet_kill(&jme->rxempty_task); -@@ -3035,6 +3034,7 @@ jme_init_one(struct pci_dev *pdev, - atomic_set(&jme->rx_empty, 1); - - tasklet_setup(&jme->pcc_task, jme_pcc_tasklet); -+ 
INIT_WORK(&jme->linkch_task, jme_link_change_work); - jme->dpi.cur = PCC_P1; - - jme->reg_ghc = 0; -diff --git a/drivers/net/ethernet/jme.h b/drivers/net/ethernet/jme.h -index a2c3b00d939d..2af76329b4a2 100644 ---- a/drivers/net/ethernet/jme.h -+++ b/drivers/net/ethernet/jme.h -@@ -411,7 +411,7 @@ struct jme_adapter { - struct tasklet_struct rxempty_task; - struct tasklet_struct rxclean_task; - struct tasklet_struct txclean_task; -- struct tasklet_struct linkch_task; -+ struct work_struct linkch_task; - struct tasklet_struct pcc_task; - unsigned long flags; - u32 reg_txcs; --- -2.30.2 - diff --git a/debian/patches-rt/0143-net-sundance-Use-tasklet_disable_in_atomic.patch b/debian/patches-rt/0143-net-sundance-Use-tasklet_disable_in_atomic.patch deleted file mode 100644 index cf1226c73..000000000 --- a/debian/patches-rt/0143-net-sundance-Use-tasklet_disable_in_atomic.patch +++ /dev/null @@ -1,39 +0,0 @@ -From 8c700c501dd817dc3e45e8091c74ac5b4926faa6 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:12 +0100 -Subject: [PATCH 143/296] net: sundance: Use tasklet_disable_in_atomic(). -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -tasklet_disable() is used in the timer callback. This might be disentangled, -but without access to the hardware that's a bit risky. - -Replace it with tasklet_disable_in_atomic() so tasklet_disable() can be -changed to a sleep wait once all remaining atomic users are converted. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Denis Kirjanov <kda@linux-powerpc.org> -Cc: "David S. 
Miller" <davem@davemloft.net> -Cc: Jakub Kicinski <kuba@kernel.org> -Cc: netdev@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/net/ethernet/dlink/sundance.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/drivers/net/ethernet/dlink/sundance.c b/drivers/net/ethernet/dlink/sundance.c -index e3a8858915b3..df0eab479d51 100644 ---- a/drivers/net/ethernet/dlink/sundance.c -+++ b/drivers/net/ethernet/dlink/sundance.c -@@ -963,7 +963,7 @@ static void tx_timeout(struct net_device *dev, unsigned int txqueue) - unsigned long flag; - - netif_stop_queue(dev); -- tasklet_disable(&np->tx_tasklet); -+ tasklet_disable_in_atomic(&np->tx_tasklet); - iowrite16(0, ioaddr + IntrEnable); - printk(KERN_WARNING "%s: Transmit timed out, TxStatus %2.2x " - "TxFrameId %2.2x," --- -2.30.2 - diff --git a/debian/patches-rt/0144-ath9k-Use-tasklet_disable_in_atomic.patch b/debian/patches-rt/0144-ath9k-Use-tasklet_disable_in_atomic.patch deleted file mode 100644 index 0fd93d845..000000000 --- a/debian/patches-rt/0144-ath9k-Use-tasklet_disable_in_atomic.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 85a3ab608625ac41e3fc43b838b18b66b789cccd Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:13 +0100 -Subject: [PATCH 144/296] ath9k: Use tasklet_disable_in_atomic() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -All callers of ath9k_beacon_ensure_primary_slot() are preemptible / -acquire a mutex except for this callchain: - - spin_lock_bh(&sc->sc_pcu_lock); - ath_complete_reset() - -> ath9k_calculate_summary_state() - -> ath9k_beacon_ensure_primary_slot() - -It's unclear how that can be disentangled, so use tasklet_disable_in_atomic() -for now. This allows tasklet_disable() to become sleepable once the -remaining atomic users are cleaned up. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: ath9k-devel@qca.qualcomm.com -Cc: Kalle Valo <kvalo@codeaurora.org> -Cc: "David S. Miller" <davem@davemloft.net> -Cc: Jakub Kicinski <kuba@kernel.org> -Cc: linux-wireless@vger.kernel.org -Cc: netdev@vger.kernel.org -Acked-by: Kalle Valo <kvalo@codeaurora.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/net/wireless/ath/ath9k/beacon.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/drivers/net/wireless/ath/ath9k/beacon.c b/drivers/net/wireless/ath/ath9k/beacon.c -index 71e2ada86793..72e2e71aac0e 100644 ---- a/drivers/net/wireless/ath/ath9k/beacon.c -+++ b/drivers/net/wireless/ath/ath9k/beacon.c -@@ -251,7 +251,7 @@ void ath9k_beacon_ensure_primary_slot(struct ath_softc *sc) - int first_slot = ATH_BCBUF; - int slot; - -- tasklet_disable(&sc->bcon_tasklet); -+ tasklet_disable_in_atomic(&sc->bcon_tasklet); - - /* Find first taken slot. */ - for (slot = 0; slot < ATH_BCBUF; slot++) { --- -2.30.2 - diff --git a/debian/patches-rt/0145-atm-eni-Use-tasklet_disable_in_atomic-in-the-send-ca.patch b/debian/patches-rt/0145-atm-eni-Use-tasklet_disable_in_atomic-in-the-send-ca.patch deleted file mode 100644 index 798384e1a..000000000 --- a/debian/patches-rt/0145-atm-eni-Use-tasklet_disable_in_atomic-in-the-send-ca.patch +++ /dev/null @@ -1,42 +0,0 @@ -From 8d6d3fe4fcbaf05360d73100c2a390dc0f5da4b9 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:14 +0100 -Subject: [PATCH 145/296] atm: eni: Use tasklet_disable_in_atomic() in the - send() callback -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The atmdev_ops::send callback which calls tasklet_disable() is invoked with -bottom halves disabled from net_device_ops::ndo_start_xmit(). 
All other -invocations of tasklet_disable() in this driver happen in preemptible -context. - -Change the send() call to use tasklet_disable_in_atomic() which allows -tasklet_disable() to be made sleepable once the remaining atomic context -usage sites are cleaned up. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Chas Williams <3chas3@gmail.com> -Cc: linux-atm-general@lists.sourceforge.net -Cc: netdev@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/atm/eni.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c -index b574cce98dc3..422753d52244 100644 ---- a/drivers/atm/eni.c -+++ b/drivers/atm/eni.c -@@ -2054,7 +2054,7 @@ static int eni_send(struct atm_vcc *vcc,struct sk_buff *skb) - } - submitted++; - ATM_SKB(skb)->vcc = vcc; -- tasklet_disable(&ENI_DEV(vcc->dev)->task); -+ tasklet_disable_in_atomic(&ENI_DEV(vcc->dev)->task); - res = do_tx(skb); - tasklet_enable(&ENI_DEV(vcc->dev)->task); - if (res == enq_ok) return 0; --- -2.30.2 - diff --git a/debian/patches-rt/0146-PCI-hv-Use-tasklet_disable_in_atomic.patch b/debian/patches-rt/0146-PCI-hv-Use-tasklet_disable_in_atomic.patch deleted file mode 100644 index 0983ffaad..000000000 --- a/debian/patches-rt/0146-PCI-hv-Use-tasklet_disable_in_atomic.patch +++ /dev/null @@ -1,46 +0,0 @@ -From 598e2b04d74c9033bd19ec0975658cc53e8edccc Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:15 +0100 -Subject: [PATCH 146/296] PCI: hv: Use tasklet_disable_in_atomic() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The hv_compose_msi_msg() callback in irq_chip::irq_compose_msi_msg is -invoked via irq_chip_compose_msi_msg(), which itself is always invoked from -atomic contexts from the guts of the interrupt core code. 
- -There is no way to change this w/o rewriting the whole driver, so use -tasklet_disable_in_atomic() which allows to make tasklet_disable() -sleepable once the remaining atomic users are addressed. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: "K. Y. Srinivasan" <kys@microsoft.com> -Cc: Haiyang Zhang <haiyangz@microsoft.com> -Cc: Stephen Hemminger <sthemmin@microsoft.com> -Cc: Wei Liu <wei.liu@kernel.org> -Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> -Cc: Rob Herring <robh@kernel.org> -Cc: Bjorn Helgaas <bhelgaas@google.com> -Cc: linux-hyperv@vger.kernel.org -Cc: linux-pci@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/pci/controller/pci-hyperv.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c -index 03ed5cb1c4b2..7370cdc1abdb 100644 ---- a/drivers/pci/controller/pci-hyperv.c -+++ b/drivers/pci/controller/pci-hyperv.c -@@ -1458,7 +1458,7 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) - * Prevents hv_pci_onchannelcallback() from running concurrently - * in the tasklet. 
- */ -- tasklet_disable(&channel->callback_event); -+ tasklet_disable_in_atomic(&channel->callback_event); - - /* - * Since this function is called with IRQ locks held, can't --- -2.30.2 - diff --git a/debian/patches-rt/0147-firewire-ohci-Use-tasklet_disable_in_atomic-where-re.patch b/debian/patches-rt/0147-firewire-ohci-Use-tasklet_disable_in_atomic-where-re.patch deleted file mode 100644 index b2965e102..000000000 --- a/debian/patches-rt/0147-firewire-ohci-Use-tasklet_disable_in_atomic-where-re.patch +++ /dev/null @@ -1,61 +0,0 @@ -From 4bbc363db82b66a1f6bcc0a66d55f60767cbd6aa Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:16 +0100 -Subject: [PATCH 147/296] firewire: ohci: Use tasklet_disable_in_atomic() where - required -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -tasklet_disable() is invoked in several places. Some of them are in atomic -context which prevents a conversion of tasklet_disable() to a sleepable -function. - -The atomic callchains are: - - ar_context_tasklet() - ohci_cancel_packet() - tasklet_disable() - - ... - ohci_flush_iso_completions() - tasklet_disable() - -The invocation of tasklet_disable() from at_context_flush() is always in -preemptible context. - -Use tasklet_disable_in_atomic() for the two invocations in -ohci_cancel_packet() and ohci_flush_iso_completions(). 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> -Cc: linux1394-devel@lists.sourceforge.net -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/firewire/ohci.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c -index 9811c40956e5..17c9d825188b 100644 ---- a/drivers/firewire/ohci.c -+++ b/drivers/firewire/ohci.c -@@ -2545,7 +2545,7 @@ static int ohci_cancel_packet(struct fw_card *card, struct fw_packet *packet) - struct driver_data *driver_data = packet->driver_data; - int ret = -ENOENT; - -- tasklet_disable(&ctx->tasklet); -+ tasklet_disable_in_atomic(&ctx->tasklet); - - if (packet->ack != 0) - goto out; -@@ -3465,7 +3465,7 @@ static int ohci_flush_iso_completions(struct fw_iso_context *base) - struct iso_context *ctx = container_of(base, struct iso_context, base); - int ret = 0; - -- tasklet_disable(&ctx->context.tasklet); -+ tasklet_disable_in_atomic(&ctx->context.tasklet); - - if (!test_and_set_bit_lock(0, &ctx->flushing_completions)) { - context_tasklet((unsigned long)&ctx->context); --- -2.30.2 - diff --git a/debian/patches-rt/0148-tasklets-Switch-tasklet_disable-to-the-sleep-wait-va.patch b/debian/patches-rt/0148-tasklets-Switch-tasklet_disable-to-the-sleep-wait-va.patch deleted file mode 100644 index 9f11cdd21..000000000 --- a/debian/patches-rt/0148-tasklets-Switch-tasklet_disable-to-the-sleep-wait-va.patch +++ /dev/null @@ -1,35 +0,0 @@ -From 71206860ba46d511c6f9c6736dcba3d40a8f2952 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:42:17 +0100 -Subject: [PATCH 148/296] tasklets: Switch tasklet_disable() to the sleep wait - variant -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - - -- NOT FOR IMMEDIATE MERGING -- - -Now that all users of 
tasklet_disable() are invoked from sleepable context, -convert it to use tasklet_unlock_wait() which might sleep. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/interrupt.h | 3 +-- - 1 file changed, 1 insertion(+), 2 deletions(-) - -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h -index 4b3ec31189db..e58f9b0650d3 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -707,8 +707,7 @@ static inline void tasklet_disable_in_atomic(struct tasklet_struct *t) - static inline void tasklet_disable(struct tasklet_struct *t) - { - tasklet_disable_nosync(t); -- /* Spin wait until all atomic users are converted */ -- tasklet_unlock_spin_wait(t); -+ tasklet_unlock_wait(t); - smp_mb(); - } - --- -2.30.2 - diff --git a/debian/patches-rt/0149-softirq-Add-RT-specific-softirq-accounting.patch b/debian/patches-rt/0149-softirq-Add-RT-specific-softirq-accounting.patch deleted file mode 100644 index 5b118e405..000000000 --- a/debian/patches-rt/0149-softirq-Add-RT-specific-softirq-accounting.patch +++ /dev/null @@ -1,75 +0,0 @@ -From 446e65ed8fcd174687364fa85491144ad58da584 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:55:53 +0100 -Subject: [PATCH 149/296] softirq: Add RT specific softirq accounting -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -RT requires the softirq processing and local bottomhalf disabled regions to -be preemptible. Using the normal preempt count based serialization is -therefore not possible because this implicitly disables preemption. - -RT kernels use a per CPU local lock to serialize bottomhalfs. As -local_bh_disable() can nest the lock can only be acquired on the outermost -invocation of local_bh_disable() and released when the nest count becomes -zero. 
Tasks which hold the local lock can be preempted so it's required to -keep track of the nest count per task. - -Add a RT only counter to task struct and adjust the relevant macros in -preempt.h. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/hardirq.h | 1 + - include/linux/preempt.h | 6 +++++- - include/linux/sched.h | 3 +++ - 3 files changed, 9 insertions(+), 1 deletion(-) - -diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h -index 0926e9ca4d85..76878b357ffa 100644 ---- a/include/linux/hardirq.h -+++ b/include/linux/hardirq.h -@@ -6,6 +6,7 @@ - #include <linux/preempt.h> - #include <linux/lockdep.h> - #include <linux/ftrace_irq.h> -+#include <linux/sched.h> - #include <linux/vtime.h> - #include <asm/hardirq.h> - -diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index 69cc8b64aa3a..9881eac0698f 100644 ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -79,7 +79,11 @@ - - #define nmi_count() (preempt_count() & NMI_MASK) - #define hardirq_count() (preempt_count() & HARDIRQ_MASK) --#define softirq_count() (preempt_count() & SOFTIRQ_MASK) -+#ifdef CONFIG_PREEMPT_RT -+# define softirq_count() (current->softirq_disable_cnt & SOFTIRQ_MASK) -+#else -+# define softirq_count() (preempt_count() & SOFTIRQ_MASK) -+#endif - #define irq_count() (nmi_count() | hardirq_count() | softirq_count()) - - /* -diff --git a/include/linux/sched.h b/include/linux/sched.h -index 6e324d455a0c..1a146988088d 100644 ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -1039,6 +1039,9 @@ struct task_struct { - int softirq_context; - int irq_config; - #endif -+#ifdef CONFIG_PREEMPT_RT -+ int softirq_disable_cnt; -+#endif - - #ifdef CONFIG_LOCKDEP - # define MAX_LOCK_DEPTH 48UL --- -2.30.2 - diff --git 
a/debian/patches-rt/0150-irqtime-Make-accounting-correct-on-RT.patch b/debian/patches-rt/0150-irqtime-Make-accounting-correct-on-RT.patch deleted file mode 100644 index b30c1eecd..000000000 --- a/debian/patches-rt/0150-irqtime-Make-accounting-correct-on-RT.patch +++ /dev/null @@ -1,54 +0,0 @@ -From 92e2527c99184eb5921d6256c160bbb1d52a22a9 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:55:54 +0100 -Subject: [PATCH 150/296] irqtime: Make accounting correct on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -vtime_account_irq and irqtime_account_irq() base checks on preempt_count() -which fails on RT because preempt_count() does not contain the softirq -accounting which is separate on RT. - -These checks do not need the full preempt count as they only operate on the -hard and softirq sections. - -Use irq_count() instead which provides the correct value on both RT and non -RT kernels. The compiler is clever enough to fold the masking for !RT: - - 99b: 65 8b 05 00 00 00 00 mov %gs:0x0(%rip),%eax - - 9a2: 25 ff ff ff 7f and $0x7fffffff,%eax - + 9a2: 25 00 ff ff 00 and $0xffff00,%eax - -Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/sched/cputime.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c -index 5f611658eeab..2c36a5fad589 100644 ---- a/kernel/sched/cputime.c -+++ b/kernel/sched/cputime.c -@@ -60,7 +60,7 @@ void irqtime_account_irq(struct task_struct *curr, unsigned int offset) - cpu = smp_processor_id(); - delta = sched_clock_cpu(cpu) - irqtime->irq_start_time; - irqtime->irq_start_time += delta; -- pc = preempt_count() - 
offset; -+ pc = irq_count() - offset; - - /* - * We do not account for softirq time from ksoftirqd here. -@@ -421,7 +421,7 @@ void vtime_task_switch(struct task_struct *prev) - - void vtime_account_irq(struct task_struct *tsk, unsigned int offset) - { -- unsigned int pc = preempt_count() - offset; -+ unsigned int pc = irq_count() - offset; - - if (pc & HARDIRQ_OFFSET) { - vtime_account_hardirq(tsk); --- -2.30.2 - diff --git a/debian/patches-rt/0151-softirq-Move-various-protections-into-inline-helpers.patch b/debian/patches-rt/0151-softirq-Move-various-protections-into-inline-helpers.patch deleted file mode 100644 index dd9b4bd51..000000000 --- a/debian/patches-rt/0151-softirq-Move-various-protections-into-inline-helpers.patch +++ /dev/null @@ -1,108 +0,0 @@ -From 5cb6f0af292f2f51c433755993d7828bf51200ad Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:55:55 +0100 -Subject: [PATCH 151/296] softirq: Move various protections into inline helpers -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -To allow reuse of the bulk of softirq processing code for RT and to avoid -#ifdeffery all over the place, split protections for various code sections -out into inline helpers so the RT variant can just replace them in one go. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/softirq.c | 39 ++++++++++++++++++++++++++++++++------- - 1 file changed, 32 insertions(+), 7 deletions(-) - -diff --git a/kernel/softirq.c b/kernel/softirq.c -index fcb201ceed71..87fac6ac0c32 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -205,6 +205,32 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt) - } - EXPORT_SYMBOL(__local_bh_enable_ip); - -+static inline void softirq_handle_begin(void) -+{ -+ __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET); -+} -+ -+static inline void softirq_handle_end(void) -+{ -+ __local_bh_enable(SOFTIRQ_OFFSET); -+ WARN_ON_ONCE(in_interrupt()); -+} -+ -+static inline void ksoftirqd_run_begin(void) -+{ -+ local_irq_disable(); -+} -+ -+static inline void ksoftirqd_run_end(void) -+{ -+ local_irq_enable(); -+} -+ -+static inline bool should_wake_ksoftirqd(void) -+{ -+ return true; -+} -+ - static inline void invoke_softirq(void) - { - if (ksoftirqd_running(local_softirq_pending())) -@@ -317,7 +343,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) - - pending = local_softirq_pending(); - -- __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET); -+ softirq_handle_begin(); - in_hardirq = lockdep_softirq_start(); - account_softirq_enter(current); - -@@ -368,8 +394,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) - - account_softirq_exit(current); - lockdep_softirq_end(in_hardirq); -- __local_bh_enable(SOFTIRQ_OFFSET); -- WARN_ON_ONCE(in_interrupt()); -+ softirq_handle_end(); - current_restore_flags(old_flags, PF_MEMALLOC); - } - -@@ -464,7 +489,7 @@ inline void raise_softirq_irqoff(unsigned int nr) - * Otherwise we wake up ksoftirqd to make sure we - * schedule the softirq soon. 
- */ -- if (!in_interrupt()) -+ if (!in_interrupt() && should_wake_ksoftirqd()) - wakeup_softirqd(); - } - -@@ -692,18 +717,18 @@ static int ksoftirqd_should_run(unsigned int cpu) - - static void run_ksoftirqd(unsigned int cpu) - { -- local_irq_disable(); -+ ksoftirqd_run_begin(); - if (local_softirq_pending()) { - /* - * We can safely run softirq on inline stack, as we are not deep - * in the task stack here. - */ - __do_softirq(); -- local_irq_enable(); -+ ksoftirqd_run_end(); - cond_resched(); - return; - } -- local_irq_enable(); -+ ksoftirqd_run_end(); - } - - #ifdef CONFIG_HOTPLUG_CPU --- -2.30.2 - diff --git a/debian/patches-rt/0152-softirq-Make-softirq-control-and-processing-RT-aware.patch b/debian/patches-rt/0152-softirq-Make-softirq-control-and-processing-RT-aware.patch deleted file mode 100644 index 839eb047f..000000000 --- a/debian/patches-rt/0152-softirq-Make-softirq-control-and-processing-RT-aware.patch +++ /dev/null @@ -1,267 +0,0 @@ -From 8e53351952808efa0b17c850a8710b6cfbb9273b Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:55:56 +0100 -Subject: [PATCH 152/296] softirq: Make softirq control and processing RT aware -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Provide a local lock based serialization for soft interrupts on RT which -allows the local_bh_disabled() sections and servicing soft interrupts to be -preemptible. - -Provide the necessary inline helpers which allow to reuse the bulk of the -softirq processing code. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/bottom_half.h | 2 +- - kernel/softirq.c | 188 ++++++++++++++++++++++++++++++++++-- - 2 files changed, 182 insertions(+), 8 deletions(-) - -diff --git a/include/linux/bottom_half.h b/include/linux/bottom_half.h -index a19519f4241d..e4dd613a070e 100644 ---- a/include/linux/bottom_half.h -+++ b/include/linux/bottom_half.h -@@ -4,7 +4,7 @@ - - #include <linux/preempt.h> - --#ifdef CONFIG_TRACE_IRQFLAGS -+#if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_TRACE_IRQFLAGS) - extern void __local_bh_disable_ip(unsigned long ip, unsigned int cnt); - #else - static __always_inline void __local_bh_disable_ip(unsigned long ip, unsigned int cnt) -diff --git a/kernel/softirq.c b/kernel/softirq.c -index 87fac6ac0c32..ed13f6097de8 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -13,6 +13,7 @@ - #include <linux/kernel_stat.h> - #include <linux/interrupt.h> - #include <linux/init.h> -+#include <linux/local_lock.h> - #include <linux/mm.h> - #include <linux/notifier.h> - #include <linux/percpu.h> -@@ -101,20 +102,189 @@ EXPORT_PER_CPU_SYMBOL_GPL(hardirq_context); - #endif - - /* -- * preempt_count and SOFTIRQ_OFFSET usage: -- * - preempt_count is changed by SOFTIRQ_OFFSET on entering or leaving -- * softirq processing. -- * - preempt_count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET) -+ * SOFTIRQ_OFFSET usage: -+ * -+ * On !RT kernels 'count' is the preempt counter, on RT kernels this applies -+ * to a per CPU counter and to task::softirqs_disabled_cnt. -+ * -+ * - count is changed by SOFTIRQ_OFFSET on entering or leaving softirq -+ * processing. -+ * -+ * - count is changed by SOFTIRQ_DISABLE_OFFSET (= 2 * SOFTIRQ_OFFSET) - * on local_bh_disable or local_bh_enable. 
-+ * - * This lets us distinguish between whether we are currently processing - * softirq and whether we just have bh disabled. - */ -+#ifdef CONFIG_PREEMPT_RT - --#ifdef CONFIG_TRACE_IRQFLAGS - /* -- * This is for softirq.c-internal use, where hardirqs are disabled -+ * RT accounts for BH disabled sections in task::softirqs_disabled_cnt and -+ * also in per CPU softirq_ctrl::cnt. This is necessary to allow tasks in a -+ * softirq disabled section to be preempted. -+ * -+ * The per task counter is used for softirq_count(), in_softirq() and -+ * in_serving_softirqs() because these counts are only valid when the task -+ * holding softirq_ctrl::lock is running. -+ * -+ * The per CPU counter prevents pointless wakeups of ksoftirqd in case that -+ * the task which is in a softirq disabled section is preempted or blocks. -+ */ -+struct softirq_ctrl { -+ local_lock_t lock; -+ int cnt; -+}; -+ -+static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = { -+ .lock = INIT_LOCAL_LOCK(softirq_ctrl.lock), -+}; -+ -+void __local_bh_disable_ip(unsigned long ip, unsigned int cnt) -+{ -+ unsigned long flags; -+ int newcnt; -+ -+ WARN_ON_ONCE(in_hardirq()); -+ -+ /* First entry of a task into a BH disabled section? */ -+ if (!current->softirq_disable_cnt) { -+ if (preemptible()) { -+ local_lock(&softirq_ctrl.lock); -+ /* Required to meet the RCU bottomhalf requirements. */ -+ rcu_read_lock(); -+ } else { -+ DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt)); -+ } -+ } -+ -+ /* -+ * Track the per CPU softirq disabled state. On RT this is per CPU -+ * state to allow preemption of bottom half disabled sections. -+ */ -+ newcnt = __this_cpu_add_return(softirq_ctrl.cnt, cnt); -+ /* -+ * Reflect the result in the task state to prevent recursion on the -+ * local lock and to make softirq_count() & al work. 
-+ */ -+ current->softirq_disable_cnt = newcnt; -+ -+ if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && newcnt == cnt) { -+ raw_local_irq_save(flags); -+ lockdep_softirqs_off(ip); -+ raw_local_irq_restore(flags); -+ } -+} -+EXPORT_SYMBOL(__local_bh_disable_ip); -+ -+static void __local_bh_enable(unsigned int cnt, bool unlock) -+{ -+ unsigned long flags; -+ int newcnt; -+ -+ DEBUG_LOCKS_WARN_ON(current->softirq_disable_cnt != -+ this_cpu_read(softirq_ctrl.cnt)); -+ -+ if (IS_ENABLED(CONFIG_TRACE_IRQFLAGS) && softirq_count() == cnt) { -+ raw_local_irq_save(flags); -+ lockdep_softirqs_on(_RET_IP_); -+ raw_local_irq_restore(flags); -+ } -+ -+ newcnt = __this_cpu_sub_return(softirq_ctrl.cnt, cnt); -+ current->softirq_disable_cnt = newcnt; -+ -+ if (!newcnt && unlock) { -+ rcu_read_unlock(); -+ local_unlock(&softirq_ctrl.lock); -+ } -+} -+ -+void __local_bh_enable_ip(unsigned long ip, unsigned int cnt) -+{ -+ bool preempt_on = preemptible(); -+ unsigned long flags; -+ u32 pending; -+ int curcnt; -+ -+ WARN_ON_ONCE(in_irq()); -+ lockdep_assert_irqs_enabled(); -+ -+ local_irq_save(flags); -+ curcnt = __this_cpu_read(softirq_ctrl.cnt); -+ -+ /* -+ * If this is not reenabling soft interrupts, no point in trying to -+ * run pending ones. -+ */ -+ if (curcnt != cnt) -+ goto out; -+ -+ pending = local_softirq_pending(); -+ if (!pending || ksoftirqd_running(pending)) -+ goto out; -+ -+ /* -+ * If this was called from non preemptible context, wake up the -+ * softirq daemon. -+ */ -+ if (!preempt_on) { -+ wakeup_softirqd(); -+ goto out; -+ } -+ -+ /* -+ * Adjust softirq count to SOFTIRQ_OFFSET which makes -+ * in_serving_softirq() become true. 
-+ */ -+ cnt = SOFTIRQ_OFFSET; -+ __local_bh_enable(cnt, false); -+ __do_softirq(); -+ -+out: -+ __local_bh_enable(cnt, preempt_on); -+ local_irq_restore(flags); -+} -+EXPORT_SYMBOL(__local_bh_enable_ip); -+ -+/* -+ * Invoked from ksoftirqd_run() outside of the interrupt disabled section -+ * to acquire the per CPU local lock for reentrancy protection. -+ */ -+static inline void ksoftirqd_run_begin(void) -+{ -+ __local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET); -+ local_irq_disable(); -+} -+ -+/* Counterpart to ksoftirqd_run_begin() */ -+static inline void ksoftirqd_run_end(void) -+{ -+ __local_bh_enable(SOFTIRQ_OFFSET, true); -+ WARN_ON_ONCE(in_interrupt()); -+ local_irq_enable(); -+} -+ -+static inline void softirq_handle_begin(void) { } -+static inline void softirq_handle_end(void) { } -+ -+static inline bool should_wake_ksoftirqd(void) -+{ -+ return !this_cpu_read(softirq_ctrl.cnt); -+} -+ -+static inline void invoke_softirq(void) -+{ -+ if (should_wake_ksoftirqd()) -+ wakeup_softirqd(); -+} -+ -+#else /* CONFIG_PREEMPT_RT */ -+ -+/* -+ * This one is for softirq.c-internal use, where hardirqs are disabled - * legitimately: - */ -+#ifdef CONFIG_TRACE_IRQFLAGS - void __local_bh_disable_ip(unsigned long ip, unsigned int cnt) - { - unsigned long flags; -@@ -275,6 +445,8 @@ asmlinkage __visible void do_softirq(void) - local_irq_restore(flags); - } - -+#endif /* !CONFIG_PREEMPT_RT */ -+ - /* - * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times, - * but break the loop if need_resched() is set or after 2 ms. 
-@@ -379,8 +551,10 @@ asmlinkage __visible void __softirq_entry __do_softirq(void) - pending >>= softirq_bit; - } - -- if (__this_cpu_read(ksoftirqd) == current) -+ if (!IS_ENABLED(CONFIG_PREEMPT_RT) && -+ __this_cpu_read(ksoftirqd) == current) - rcu_softirq_qs(); -+ - local_irq_disable(); - - pending = local_softirq_pending(); --- -2.30.2 - diff --git a/debian/patches-rt/0153-tick-sched-Prevent-false-positive-softirq-pending-wa.patch b/debian/patches-rt/0153-tick-sched-Prevent-false-positive-softirq-pending-wa.patch deleted file mode 100644 index 9063a29d2..000000000 --- a/debian/patches-rt/0153-tick-sched-Prevent-false-positive-softirq-pending-wa.patch +++ /dev/null @@ -1,84 +0,0 @@ -From 8d9fd475e1b06390773957dc77b27ea66adde354 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:55:57 +0100 -Subject: [PATCH 153/296] tick/sched: Prevent false positive softirq pending - warnings on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -On RT a task which has soft interrupts disabled can block on a lock and -schedule out to idle while soft interrupts are pending. This triggers the -warning in the NOHZ idle code which complains about going idle with pending -soft interrupts. But as the task is blocked soft interrupt processing is -temporarily blocked as well which means that such a warning is a false -positive. - -To prevent that check the per CPU state which indicates that a scheduled -out task has soft interrupts disabled. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Reviewed-by: Frederic Weisbecker <frederic@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/bottom_half.h | 6 ++++++ - kernel/softirq.c | 15 +++++++++++++++ - kernel/time/tick-sched.c | 2 +- - 3 files changed, 22 insertions(+), 1 deletion(-) - -diff --git a/include/linux/bottom_half.h b/include/linux/bottom_half.h -index e4dd613a070e..eed86eb0a1de 100644 ---- a/include/linux/bottom_half.h -+++ b/include/linux/bottom_half.h -@@ -32,4 +32,10 @@ static inline void local_bh_enable(void) - __local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET); - } - -+#ifdef CONFIG_PREEMPT_RT -+extern bool local_bh_blocked(void); -+#else -+static inline bool local_bh_blocked(void) { return false; } -+#endif -+ - #endif /* _LINUX_BH_H */ -diff --git a/kernel/softirq.c b/kernel/softirq.c -index ed13f6097de8..c9adc5c46248 100644 ---- a/kernel/softirq.c -+++ b/kernel/softirq.c -@@ -139,6 +139,21 @@ static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = { - .lock = INIT_LOCAL_LOCK(softirq_ctrl.lock), - }; - -+/** -+ * local_bh_blocked() - Check for idle whether BH processing is blocked -+ * -+ * Returns false if the per CPU softirq::cnt is 0 otherwise true. -+ * -+ * This is invoked from the idle task to guard against false positive -+ * softirq pending warnings, which would happen when the task which holds -+ * softirq_ctrl::lock was the only running task on the CPU and blocks on -+ * some other lock. 
-+ */ -+bool local_bh_blocked(void) -+{ -+ return __this_cpu_read(softirq_ctrl.cnt) != 0; -+} -+ - void __local_bh_disable_ip(unsigned long ip, unsigned int cnt) - { - unsigned long flags; -diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c -index e8d351b7f9b0..3e3130a7d44e 100644 ---- a/kernel/time/tick-sched.c -+++ b/kernel/time/tick-sched.c -@@ -925,7 +925,7 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts) - if (unlikely(local_softirq_pending())) { - static int ratelimit; - -- if (ratelimit < 10 && -+ if (ratelimit < 10 && !local_bh_blocked() && - (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) { - pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n", - (unsigned int) local_softirq_pending()); --- -2.30.2 - diff --git a/debian/patches-rt/0154-rcu-Prevent-false-positive-softirq-warning-on-RT.patch b/debian/patches-rt/0154-rcu-Prevent-false-positive-softirq-warning-on-RT.patch deleted file mode 100644 index 2166a4194..000000000 --- a/debian/patches-rt/0154-rcu-Prevent-false-positive-softirq-warning-on-RT.patch +++ /dev/null @@ -1,35 +0,0 @@ -From c1bedae54f7b7cfd8fe3f2d26a85badfafb15b87 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 9 Mar 2021 09:55:58 +0100 -Subject: [PATCH 154/296] rcu: Prevent false positive softirq warning on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Soft interrupt disabled sections can legitimately be preempted or schedule -out when blocking on a lock on RT enabled kernels so the RCU preempt check -warning has to be disabled for RT kernels. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Reviewed-by: Paul E. 
McKenney <paulmck@kernel.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/rcupdate.h | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h -index c5adba5e79e7..fc5ba83fc818 100644 ---- a/include/linux/rcupdate.h -+++ b/include/linux/rcupdate.h -@@ -327,7 +327,8 @@ static inline void rcu_preempt_sleep_check(void) { } - #define rcu_sleep_check() \ - do { \ - rcu_preempt_sleep_check(); \ -- RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map), \ -+ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) \ -+ RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map), \ - "Illegal context switch in RCU-bh read-side critical section"); \ - RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map), \ - "Illegal context switch in RCU-sched read-side critical section"); \ --- -2.30.2 - diff --git a/debian/patches-rt/0155-chelsio-cxgb-Replace-the-workqueue-with-threaded-int.patch b/debian/patches-rt/0155-chelsio-cxgb-Replace-the-workqueue-with-threaded-int.patch deleted file mode 100644 index b54957fa3..000000000 --- a/debian/patches-rt/0155-chelsio-cxgb-Replace-the-workqueue-with-threaded-int.patch +++ /dev/null @@ -1,271 +0,0 @@ -From 7740cb6404f4997bd1188a2f9872ff4bfd056443 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 2 Feb 2021 18:01:03 +0100 -Subject: [PATCH 155/296] chelsio: cxgb: Replace the workqueue with threaded - interrupt -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The external interrupt (F_PL_INTR_EXT) needs to be handled in a process -context and this is accomplished by utilizing a workqueue. - -The process context can also be provided by a threaded interrupt instead -of a workqueue. The threaded interrupt can be used later for other -interrupt related processing which require non-atomic context without -using yet another workqueue. 
free_irq() also ensures that the thread is -done which is currently missing (the worker could continue after the -module has been removed). - -Save pending flags in pending_thread_intr. Use the same mechanism -to disable F_PL_INTR_EXT as interrupt source like it is used before the -worker is scheduled. Enable the interrupt again once -t1_elmer0_ext_intr_handler() is done. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/net/ethernet/chelsio/cxgb/common.h | 5 +-- - drivers/net/ethernet/chelsio/cxgb/cxgb2.c | 44 ++-------------------- - drivers/net/ethernet/chelsio/cxgb/sge.c | 33 ++++++++++++++-- - drivers/net/ethernet/chelsio/cxgb/sge.h | 1 + - drivers/net/ethernet/chelsio/cxgb/subr.c | 26 +++++++++---- - 5 files changed, 55 insertions(+), 54 deletions(-) - -diff --git a/drivers/net/ethernet/chelsio/cxgb/common.h b/drivers/net/ethernet/chelsio/cxgb/common.h -index 6475060649e9..e999a9b9fe6c 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/common.h -+++ b/drivers/net/ethernet/chelsio/cxgb/common.h -@@ -238,7 +238,6 @@ struct adapter { - int msg_enable; - u32 mmio_len; - -- struct work_struct ext_intr_handler_task; - struct adapter_params params; - - /* Terminator modules. 
*/ -@@ -257,6 +256,7 @@ struct adapter { - - /* guards async operations */ - spinlock_t async_lock ____cacheline_aligned; -+ u32 pending_thread_intr; - u32 slow_intr_mask; - int t1powersave; - }; -@@ -334,8 +334,7 @@ void t1_interrupts_enable(adapter_t *adapter); - void t1_interrupts_disable(adapter_t *adapter); - void t1_interrupts_clear(adapter_t *adapter); - int t1_elmer0_ext_intr_handler(adapter_t *adapter); --void t1_elmer0_ext_intr(adapter_t *adapter); --int t1_slow_intr_handler(adapter_t *adapter); -+irqreturn_t t1_slow_intr_handler(adapter_t *adapter); - - int t1_link_start(struct cphy *phy, struct cmac *mac, struct link_config *lc); - const struct board_info *t1_get_board_info(unsigned int board_id); -diff --git a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c -index 0e4a0f413960..bd6f3c532d72 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c -+++ b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c -@@ -211,9 +211,10 @@ static int cxgb_up(struct adapter *adapter) - t1_interrupts_clear(adapter); - - adapter->params.has_msi = !disable_msi && !pci_enable_msi(adapter->pdev); -- err = request_irq(adapter->pdev->irq, t1_interrupt, -- adapter->params.has_msi ? 0 : IRQF_SHARED, -- adapter->name, adapter); -+ err = request_threaded_irq(adapter->pdev->irq, t1_interrupt, -+ t1_interrupt_thread, -+ adapter->params.has_msi ? 0 : IRQF_SHARED, -+ adapter->name, adapter); - if (err) { - if (adapter->params.has_msi) - pci_disable_msi(adapter->pdev); -@@ -916,41 +917,6 @@ static void mac_stats_task(struct work_struct *work) - spin_unlock(&adapter->work_lock); - } - --/* -- * Processes elmer0 external interrupts in process context. 
-- */ --static void ext_intr_task(struct work_struct *work) --{ -- struct adapter *adapter = -- container_of(work, struct adapter, ext_intr_handler_task); -- -- t1_elmer0_ext_intr_handler(adapter); -- -- /* Now reenable external interrupts */ -- spin_lock_irq(&adapter->async_lock); -- adapter->slow_intr_mask |= F_PL_INTR_EXT; -- writel(F_PL_INTR_EXT, adapter->regs + A_PL_CAUSE); -- writel(adapter->slow_intr_mask | F_PL_INTR_SGE_DATA, -- adapter->regs + A_PL_ENABLE); -- spin_unlock_irq(&adapter->async_lock); --} -- --/* -- * Interrupt-context handler for elmer0 external interrupts. -- */ --void t1_elmer0_ext_intr(struct adapter *adapter) --{ -- /* -- * Schedule a task to handle external interrupts as we require -- * a process context. We disable EXT interrupts in the interim -- * and let the task reenable them when it's done. -- */ -- adapter->slow_intr_mask &= ~F_PL_INTR_EXT; -- writel(adapter->slow_intr_mask | F_PL_INTR_SGE_DATA, -- adapter->regs + A_PL_ENABLE); -- schedule_work(&adapter->ext_intr_handler_task); --} -- - void t1_fatal_err(struct adapter *adapter) - { - if (adapter->flags & FULL_INIT_DONE) { -@@ -1062,8 +1028,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent) - spin_lock_init(&adapter->async_lock); - spin_lock_init(&adapter->mac_lock); - -- INIT_WORK(&adapter->ext_intr_handler_task, -- ext_intr_task); - INIT_DELAYED_WORK(&adapter->stats_update_task, - mac_stats_task); - -diff --git a/drivers/net/ethernet/chelsio/cxgb/sge.c b/drivers/net/ethernet/chelsio/cxgb/sge.c -index 2d9c2b5a690a..5aef9ae1ecfe 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/sge.c -+++ b/drivers/net/ethernet/chelsio/cxgb/sge.c -@@ -1619,11 +1619,38 @@ int t1_poll(struct napi_struct *napi, int budget) - return work_done; - } - -+irqreturn_t t1_interrupt_thread(int irq, void *data) -+{ -+ struct adapter *adapter = data; -+ u32 pending_thread_intr; -+ -+ spin_lock_irq(&adapter->async_lock); -+ pending_thread_intr = adapter->pending_thread_intr; -+ 
adapter->pending_thread_intr = 0; -+ spin_unlock_irq(&adapter->async_lock); -+ -+ if (!pending_thread_intr) -+ return IRQ_NONE; -+ -+ if (pending_thread_intr & F_PL_INTR_EXT) -+ t1_elmer0_ext_intr_handler(adapter); -+ -+ spin_lock_irq(&adapter->async_lock); -+ adapter->slow_intr_mask |= F_PL_INTR_EXT; -+ -+ writel(F_PL_INTR_EXT, adapter->regs + A_PL_CAUSE); -+ writel(adapter->slow_intr_mask | F_PL_INTR_SGE_DATA, -+ adapter->regs + A_PL_ENABLE); -+ spin_unlock_irq(&adapter->async_lock); -+ -+ return IRQ_HANDLED; -+} -+ - irqreturn_t t1_interrupt(int irq, void *data) - { - struct adapter *adapter = data; - struct sge *sge = adapter->sge; -- int handled; -+ irqreturn_t handled; - - if (likely(responses_pending(adapter))) { - writel(F_PL_INTR_SGE_DATA, adapter->regs + A_PL_CAUSE); -@@ -1645,10 +1672,10 @@ irqreturn_t t1_interrupt(int irq, void *data) - handled = t1_slow_intr_handler(adapter); - spin_unlock(&adapter->async_lock); - -- if (!handled) -+ if (handled == IRQ_NONE) - sge->stats.unhandled_irqs++; - -- return IRQ_RETVAL(handled != 0); -+ return handled; - } - - /* -diff --git a/drivers/net/ethernet/chelsio/cxgb/sge.h b/drivers/net/ethernet/chelsio/cxgb/sge.h -index a1ba591b3431..76516d2a8aa9 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/sge.h -+++ b/drivers/net/ethernet/chelsio/cxgb/sge.h -@@ -74,6 +74,7 @@ struct sge *t1_sge_create(struct adapter *, struct sge_params *); - int t1_sge_configure(struct sge *, struct sge_params *); - int t1_sge_set_coalesce_params(struct sge *, struct sge_params *); - void t1_sge_destroy(struct sge *); -+irqreturn_t t1_interrupt_thread(int irq, void *data); - irqreturn_t t1_interrupt(int irq, void *cookie); - int t1_poll(struct napi_struct *, int); - -diff --git a/drivers/net/ethernet/chelsio/cxgb/subr.c b/drivers/net/ethernet/chelsio/cxgb/subr.c -index ea0f8741d7cf..d90ad07ff1a4 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/subr.c -+++ b/drivers/net/ethernet/chelsio/cxgb/subr.c -@@ -210,7 +210,7 @@ static int 
fpga_phy_intr_handler(adapter_t *adapter) - /* - * Slow path interrupt handler for FPGAs. - */ --static int fpga_slow_intr(adapter_t *adapter) -+static irqreturn_t fpga_slow_intr(adapter_t *adapter) - { - u32 cause = readl(adapter->regs + A_PL_CAUSE); - -@@ -238,7 +238,7 @@ static int fpga_slow_intr(adapter_t *adapter) - if (cause) - writel(cause, adapter->regs + A_PL_CAUSE); - -- return cause != 0; -+ return cause == 0 ? IRQ_NONE : IRQ_HANDLED; - } - #endif - -@@ -842,13 +842,14 @@ void t1_interrupts_clear(adapter_t* adapter) - /* - * Slow path interrupt handler for ASICs. - */ --static int asic_slow_intr(adapter_t *adapter) -+static irqreturn_t asic_slow_intr(adapter_t *adapter) - { - u32 cause = readl(adapter->regs + A_PL_CAUSE); -+ irqreturn_t ret = IRQ_HANDLED; - - cause &= adapter->slow_intr_mask; - if (!cause) -- return 0; -+ return IRQ_NONE; - if (cause & F_PL_INTR_SGE_ERR) - t1_sge_intr_error_handler(adapter->sge); - if (cause & F_PL_INTR_TP) -@@ -857,16 +858,25 @@ static int asic_slow_intr(adapter_t *adapter) - t1_espi_intr_handler(adapter->espi); - if (cause & F_PL_INTR_PCIX) - t1_pci_intr_handler(adapter); -- if (cause & F_PL_INTR_EXT) -- t1_elmer0_ext_intr(adapter); -+ if (cause & F_PL_INTR_EXT) { -+ /* Wake the threaded interrupt to handle external interrupts as -+ * we require a process context. We disable EXT interrupts in -+ * the interim and let the thread reenable them when it's done. -+ */ -+ adapter->pending_thread_intr |= F_PL_INTR_EXT; -+ adapter->slow_intr_mask &= ~F_PL_INTR_EXT; -+ writel(adapter->slow_intr_mask | F_PL_INTR_SGE_DATA, -+ adapter->regs + A_PL_ENABLE); -+ ret = IRQ_WAKE_THREAD; -+ } - - /* Clear the interrupts just processed. 
*/ - writel(cause, adapter->regs + A_PL_CAUSE); - readl(adapter->regs + A_PL_CAUSE); /* flush writes */ -- return 1; -+ return ret; - } - --int t1_slow_intr_handler(adapter_t *adapter) -+irqreturn_t t1_slow_intr_handler(adapter_t *adapter) - { - #ifdef CONFIG_CHELSIO_T1_1G - if (!t1_is_asic(adapter)) --- -2.30.2 - diff --git a/debian/patches-rt/0156-chelsio-cxgb-Disable-the-card-on-error-in-threaded-i.patch b/debian/patches-rt/0156-chelsio-cxgb-Disable-the-card-on-error-in-threaded-i.patch deleted file mode 100644 index d6a037342..000000000 --- a/debian/patches-rt/0156-chelsio-cxgb-Disable-the-card-on-error-in-threaded-i.patch +++ /dev/null @@ -1,215 +0,0 @@ -From fb464c386dae925d57dc7d059615a138d3a06f3a Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 2 Feb 2021 18:01:04 +0100 -Subject: [PATCH 156/296] chelsio: cxgb: Disable the card on error in threaded - interrupt -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -t1_fatal_err() is invoked from the interrupt handler. The bad part is -that it invokes (via t1_sge_stop()) del_timer_sync() and tasklet_kill(). -Both functions must not be called from an interrupt because it is -possible that it will wait for the completion of the timer/tasklet it -just interrupted. - -In case of a fatal error, use t1_interrupts_disable() to disable all -interrupt sources and then wake the interrupt thread with -F_PL_INTR_SGE_ERR as pending flag. The threaded-interrupt will stop the -card via t1_sge_stop() and not re-enable the interrupts again. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/net/ethernet/chelsio/cxgb/common.h | 1 - - drivers/net/ethernet/chelsio/cxgb/cxgb2.c | 10 ------ - drivers/net/ethernet/chelsio/cxgb/sge.c | 20 +++++++++--- - drivers/net/ethernet/chelsio/cxgb/sge.h | 2 +- - drivers/net/ethernet/chelsio/cxgb/subr.c | 38 +++++++++++++++------- - 5 files changed, 44 insertions(+), 27 deletions(-) - -diff --git a/drivers/net/ethernet/chelsio/cxgb/common.h b/drivers/net/ethernet/chelsio/cxgb/common.h -index e999a9b9fe6c..0321be77366c 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/common.h -+++ b/drivers/net/ethernet/chelsio/cxgb/common.h -@@ -346,7 +346,6 @@ int t1_get_board_rev(adapter_t *adapter, const struct board_info *bi, - int t1_init_hw_modules(adapter_t *adapter); - int t1_init_sw_modules(adapter_t *adapter, const struct board_info *bi); - void t1_free_sw_modules(adapter_t *adapter); --void t1_fatal_err(adapter_t *adapter); - void t1_link_changed(adapter_t *adapter, int port_id); - void t1_link_negotiated(adapter_t *adapter, int port_id, int link_stat, - int speed, int duplex, int pause); -diff --git a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c -index bd6f3c532d72..512da98019c6 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c -+++ b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c -@@ -917,16 +917,6 @@ static void mac_stats_task(struct work_struct *work) - spin_unlock(&adapter->work_lock); - } - --void t1_fatal_err(struct adapter *adapter) --{ -- if (adapter->flags & FULL_INIT_DONE) { -- t1_sge_stop(adapter->sge); -- t1_interrupts_disable(adapter); -- } -- pr_alert("%s: encountered fatal error, operation suspended\n", -- adapter->name); --} -- - static const struct net_device_ops cxgb_netdev_ops = { - .ndo_open = cxgb_open, - .ndo_stop = cxgb_close, -diff --git a/drivers/net/ethernet/chelsio/cxgb/sge.c b/drivers/net/ethernet/chelsio/cxgb/sge.c -index 5aef9ae1ecfe..cda01f22c71c 100644 ---- 
a/drivers/net/ethernet/chelsio/cxgb/sge.c -+++ b/drivers/net/ethernet/chelsio/cxgb/sge.c -@@ -940,10 +940,11 @@ void t1_sge_intr_clear(struct sge *sge) - /* - * SGE 'Error' interrupt handler - */ --int t1_sge_intr_error_handler(struct sge *sge) -+bool t1_sge_intr_error_handler(struct sge *sge) - { - struct adapter *adapter = sge->adapter; - u32 cause = readl(adapter->regs + A_SG_INT_CAUSE); -+ bool wake = false; - - if (adapter->port[0].dev->hw_features & NETIF_F_TSO) - cause &= ~F_PACKET_TOO_BIG; -@@ -967,11 +968,14 @@ int t1_sge_intr_error_handler(struct sge *sge) - sge->stats.pkt_mismatch++; - pr_alert("%s: SGE packet mismatch\n", adapter->name); - } -- if (cause & SGE_INT_FATAL) -- t1_fatal_err(adapter); -+ if (cause & SGE_INT_FATAL) { -+ t1_interrupts_disable(adapter); -+ adapter->pending_thread_intr |= F_PL_INTR_SGE_ERR; -+ wake = true; -+ } - - writel(cause, adapter->regs + A_SG_INT_CAUSE); -- return 0; -+ return wake; - } - - const struct sge_intr_counts *t1_sge_get_intr_counts(const struct sge *sge) -@@ -1635,6 +1639,14 @@ irqreturn_t t1_interrupt_thread(int irq, void *data) - if (pending_thread_intr & F_PL_INTR_EXT) - t1_elmer0_ext_intr_handler(adapter); - -+ /* This error is fatal, interrupts remain off */ -+ if (pending_thread_intr & F_PL_INTR_SGE_ERR) { -+ pr_alert("%s: encountered fatal error, operation suspended\n", -+ adapter->name); -+ t1_sge_stop(adapter->sge); -+ return IRQ_HANDLED; -+ } -+ - spin_lock_irq(&adapter->async_lock); - adapter->slow_intr_mask |= F_PL_INTR_EXT; - -diff --git a/drivers/net/ethernet/chelsio/cxgb/sge.h b/drivers/net/ethernet/chelsio/cxgb/sge.h -index 76516d2a8aa9..716705b96f26 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/sge.h -+++ b/drivers/net/ethernet/chelsio/cxgb/sge.h -@@ -82,7 +82,7 @@ netdev_tx_t t1_start_xmit(struct sk_buff *skb, struct net_device *dev); - void t1_vlan_mode(struct adapter *adapter, netdev_features_t features); - void t1_sge_start(struct sge *); - void t1_sge_stop(struct sge *); --int 
t1_sge_intr_error_handler(struct sge *); -+bool t1_sge_intr_error_handler(struct sge *sge); - void t1_sge_intr_enable(struct sge *); - void t1_sge_intr_disable(struct sge *); - void t1_sge_intr_clear(struct sge *); -diff --git a/drivers/net/ethernet/chelsio/cxgb/subr.c b/drivers/net/ethernet/chelsio/cxgb/subr.c -index d90ad07ff1a4..310add28fcf5 100644 ---- a/drivers/net/ethernet/chelsio/cxgb/subr.c -+++ b/drivers/net/ethernet/chelsio/cxgb/subr.c -@@ -170,7 +170,7 @@ void t1_link_changed(adapter_t *adapter, int port_id) - t1_link_negotiated(adapter, port_id, link_ok, speed, duplex, fc); - } - --static int t1_pci_intr_handler(adapter_t *adapter) -+static bool t1_pci_intr_handler(adapter_t *adapter) - { - u32 pcix_cause; - -@@ -179,9 +179,13 @@ static int t1_pci_intr_handler(adapter_t *adapter) - if (pcix_cause) { - pci_write_config_dword(adapter->pdev, A_PCICFG_INTR_CAUSE, - pcix_cause); -- t1_fatal_err(adapter); /* PCI errors are fatal */ -+ /* PCI errors are fatal */ -+ t1_interrupts_disable(adapter); -+ adapter->pending_thread_intr |= F_PL_INTR_SGE_ERR; -+ pr_alert("%s: PCI error encountered.\n", adapter->name); -+ return true; - } -- return 0; -+ return false; - } - - #ifdef CONFIG_CHELSIO_T1_1G -@@ -213,10 +217,13 @@ static int fpga_phy_intr_handler(adapter_t *adapter) - static irqreturn_t fpga_slow_intr(adapter_t *adapter) - { - u32 cause = readl(adapter->regs + A_PL_CAUSE); -+ irqreturn_t ret = IRQ_NONE; - - cause &= ~F_PL_INTR_SGE_DATA; -- if (cause & F_PL_INTR_SGE_ERR) -- t1_sge_intr_error_handler(adapter->sge); -+ if (cause & F_PL_INTR_SGE_ERR) { -+ if (t1_sge_intr_error_handler(adapter->sge)) -+ ret = IRQ_WAKE_THREAD; -+ } - - if (cause & FPGA_PCIX_INTERRUPT_GMAC) - fpga_phy_intr_handler(adapter); -@@ -231,13 +238,18 @@ static irqreturn_t fpga_slow_intr(adapter_t *adapter) - /* Clear TP interrupt */ - writel(tp_cause, adapter->regs + FPGA_TP_ADDR_INTERRUPT_CAUSE); - } -- if (cause & FPGA_PCIX_INTERRUPT_PCIX) -- t1_pci_intr_handler(adapter); -+ if (cause & 
FPGA_PCIX_INTERRUPT_PCIX) { -+ if (t1_pci_intr_handler(adapter)) -+ ret = IRQ_WAKE_THREAD; -+ } - - /* Clear the interrupts just processed. */ - if (cause) - writel(cause, adapter->regs + A_PL_CAUSE); - -+ if (ret != IRQ_NONE) -+ return ret; -+ - return cause == 0 ? IRQ_NONE : IRQ_HANDLED; - } - #endif -@@ -850,14 +862,18 @@ static irqreturn_t asic_slow_intr(adapter_t *adapter) - cause &= adapter->slow_intr_mask; - if (!cause) - return IRQ_NONE; -- if (cause & F_PL_INTR_SGE_ERR) -- t1_sge_intr_error_handler(adapter->sge); -+ if (cause & F_PL_INTR_SGE_ERR) { -+ if (t1_sge_intr_error_handler(adapter->sge)) -+ ret = IRQ_WAKE_THREAD; -+ } - if (cause & F_PL_INTR_TP) - t1_tp_intr_handler(adapter->tp); - if (cause & F_PL_INTR_ESPI) - t1_espi_intr_handler(adapter->espi); -- if (cause & F_PL_INTR_PCIX) -- t1_pci_intr_handler(adapter); -+ if (cause & F_PL_INTR_PCIX) { -+ if (t1_pci_intr_handler(adapter)) -+ ret = IRQ_WAKE_THREAD; -+ } - if (cause & F_PL_INTR_EXT) { - /* Wake the threaded interrupt to handle external interrupts as - * we require a process context. We disable EXT interrupts in --- -2.30.2 - diff --git a/debian/patches-rt/0157-x86-fpu-Simplify-fpregs_-un-lock.patch b/debian/patches-rt/0157-x86-fpu-Simplify-fpregs_-un-lock.patch deleted file mode 100644 index 32cbdd85b..000000000 --- a/debian/patches-rt/0157-x86-fpu-Simplify-fpregs_-un-lock.patch +++ /dev/null @@ -1,47 +0,0 @@ -From 144da207a10b017ea3fefc20b860c4b9661a77b0 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 27 Oct 2020 11:09:50 +0100 -Subject: [PATCH 157/296] x86/fpu: Simplify fpregs_[un]lock() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -There is no point in disabling preemption and then disabling bottom -halfs. - -Just disabling bottom halfs is sufficient as it implicitly disables -preemption on !RT kernels. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://lore.kernel.org/r/20201027101349.455380473@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/x86/include/asm/fpu/api.h | 5 +++-- - 1 file changed, 3 insertions(+), 2 deletions(-) - -diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h -index 38f4936045ab..78b2c9cbd94d 100644 ---- a/arch/x86/include/asm/fpu/api.h -+++ b/arch/x86/include/asm/fpu/api.h -@@ -40,17 +40,18 @@ static inline void kernel_fpu_begin(void) - * A context switch will (and softirq might) save CPU's FPU registers to - * fpu->state and set TIF_NEED_FPU_LOAD leaving CPU's FPU registers in - * a random state. -+ * -+ * local_bh_disable() protects against both preemption and soft interrupts -+ * on !RT kernels. - */ - static inline void fpregs_lock(void) - { -- preempt_disable(); - local_bh_disable(); - } - - static inline void fpregs_unlock(void) - { - local_bh_enable(); -- preempt_enable(); - } - - #ifdef CONFIG_X86_DEBUG_FPU --- -2.30.2 - diff --git a/debian/patches-rt/0158-x86-fpu-Make-kernel-FPU-protection-RT-friendly.patch b/debian/patches-rt/0158-x86-fpu-Make-kernel-FPU-protection-RT-friendly.patch deleted file mode 100644 index 285a63701..000000000 --- a/debian/patches-rt/0158-x86-fpu-Make-kernel-FPU-protection-RT-friendly.patch +++ /dev/null @@ -1,64 +0,0 @@ -From 4918a84000ae19a1db7790609037f15839e6b6fd Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 27 Oct 2020 11:09:51 +0100 -Subject: [PATCH 158/296] x86/fpu: Make kernel FPU protection RT friendly -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Non RT kernels need to protect FPU against preemption and bottom half -processing. This is achieved by disabling bottom halfs via -local_bh_disable() which implictly disables preemption. 
- -On RT kernels this protection mechanism is not sufficient because -local_bh_disable() does not disable preemption. It serializes bottom half -related processing via a CPU local lock. - -As bottom halfs are running always in thread context on RT kernels -disabling preemption is the proper choice as it implicitly prevents bottom -half processing. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Link: https://lore.kernel.org/r/20201027101349.588965083@linutronix.de -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/x86/include/asm/fpu/api.h | 18 ++++++++++++++++-- - 1 file changed, 16 insertions(+), 2 deletions(-) - -diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h -index 78b2c9cbd94d..67a4f1cb2aac 100644 ---- a/arch/x86/include/asm/fpu/api.h -+++ b/arch/x86/include/asm/fpu/api.h -@@ -43,15 +43,29 @@ static inline void kernel_fpu_begin(void) - * - * local_bh_disable() protects against both preemption and soft interrupts - * on !RT kernels. -+ * -+ * On RT kernels local_bh_disable() is not sufficient because it only -+ * serializes soft interrupt related sections via a local lock, but stays -+ * preemptible. Disabling preemption is the right choice here as bottom -+ * half processing is always in thread context on RT kernels so it -+ * implicitly prevents bottom half processing as well. -+ * -+ * Disabling preemption also serializes against kernel_fpu_begin(). 
- */ - static inline void fpregs_lock(void) - { -- local_bh_disable(); -+ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) -+ local_bh_disable(); -+ else -+ preempt_disable(); - } - - static inline void fpregs_unlock(void) - { -- local_bh_enable(); -+ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) -+ local_bh_enable(); -+ else -+ preempt_enable(); - } - - #ifdef CONFIG_X86_DEBUG_FPU --- -2.30.2 - diff --git a/debian/patches-rt/0159-locking-rtmutex-Remove-cruft.patch b/debian/patches-rt/0159-locking-rtmutex-Remove-cruft.patch deleted file mode 100644 index 42bd942f0..000000000 --- a/debian/patches-rt/0159-locking-rtmutex-Remove-cruft.patch +++ /dev/null @@ -1,99 +0,0 @@ -From 28cc7d452e4d3cef8d858c4138cf965402094e7c Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 29 Sep 2020 15:21:17 +0200 -Subject: [PATCH 159/296] locking/rtmutex: Remove cruft -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Most of this is around since the very beginning. I'm not sure if this -was used while the rtmutex-deadlock-tester was around but today it seems -to only waste memory: -- save_state: No users -- name: Assigned and printed if a dead lock was detected. I'm keeping it - but want to point out that lockdep has the same information. -- file + line: Printed if ::name was NULL. This is only used for - in-kernel locks so it ::name shouldn't be NULL and then ::file and - ::line isn't used. -- magic: Assigned to NULL by rt_mutex_destroy(). - -Remove members of rt_mutex which are not used. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/rtmutex.h | 7 ++----- - kernel/locking/rtmutex-debug.c | 7 +------ - kernel/locking/rtmutex.c | 3 --- - kernel/locking/rtmutex_common.h | 1 - - 4 files changed, 3 insertions(+), 15 deletions(-) - -diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h -index 6fd615a0eea9..16f974a22f51 100644 ---- a/include/linux/rtmutex.h -+++ b/include/linux/rtmutex.h -@@ -32,10 +32,7 @@ struct rt_mutex { - struct rb_root_cached waiters; - struct task_struct *owner; - #ifdef CONFIG_DEBUG_RT_MUTEXES -- int save_state; -- const char *name, *file; -- int line; -- void *magic; -+ const char *name; - #endif - #ifdef CONFIG_DEBUG_LOCK_ALLOC - struct lockdep_map dep_map; -@@ -60,7 +57,7 @@ struct hrtimer_sleeper; - - #ifdef CONFIG_DEBUG_RT_MUTEXES - # define __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \ -- , .name = #mutexname, .file = __FILE__, .line = __LINE__ -+ , .name = #mutexname - - # define rt_mutex_init(mutex) \ - do { \ -diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c -index 36e69100e8e0..7e411b946d4c 100644 ---- a/kernel/locking/rtmutex-debug.c -+++ b/kernel/locking/rtmutex-debug.c -@@ -42,12 +42,7 @@ static void printk_task(struct task_struct *p) - - static void printk_lock(struct rt_mutex *lock, int print_owner) - { -- if (lock->name) -- printk(" [%p] {%s}\n", -- lock, lock->name); -- else -- printk(" [%p] {%s:%d}\n", -- lock, lock->file, lock->line); -+ printk(" [%p] {%s}\n", lock, lock->name); - - if (print_owner && rt_mutex_owner(lock)) { - printk(".. 
->owner: %p\n", lock->owner); -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index 2f8cd616d3b2..3624e8bbee28 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -1655,9 +1655,6 @@ void __sched rt_mutex_futex_unlock(struct rt_mutex *lock) - void rt_mutex_destroy(struct rt_mutex *lock) - { - WARN_ON(rt_mutex_is_locked(lock)); --#ifdef CONFIG_DEBUG_RT_MUTEXES -- lock->magic = NULL; --#endif - } - EXPORT_SYMBOL_GPL(rt_mutex_destroy); - -diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h -index ca6fb489007b..e6913103d7ff 100644 ---- a/kernel/locking/rtmutex_common.h -+++ b/kernel/locking/rtmutex_common.h -@@ -30,7 +30,6 @@ struct rt_mutex_waiter { - struct task_struct *task; - struct rt_mutex *lock; - #ifdef CONFIG_DEBUG_RT_MUTEXES -- unsigned long ip; - struct pid *deadlock_task_pid; - struct rt_mutex *deadlock_lock; - #endif --- -2.30.2 - diff --git a/debian/patches-rt/0160-locking-rtmutex-Remove-output-from-deadlock-detector.patch b/debian/patches-rt/0160-locking-rtmutex-Remove-output-from-deadlock-detector.patch deleted file mode 100644 index c71694a6f..000000000 --- a/debian/patches-rt/0160-locking-rtmutex-Remove-output-from-deadlock-detector.patch +++ /dev/null @@ -1,312 +0,0 @@ -From 9d60eed599690549839bed0630b16c7ce8fe467d Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 29 Sep 2020 16:05:11 +0200 -Subject: [PATCH 160/296] locking/rtmutex: Remove output from deadlock - detector. -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -In commit - f5694788ad8da ("rt_mutex: Add lockdep annotations") - -rtmutex gained lockdep annotation for rt_mutex_lock() and and related -functions. -lockdep will see the locking order and may complain about a deadlock -before rtmutex' own mechanism gets a chance to detect it. 
-The rtmutex deadlock detector will only complain locks with the -RT_MUTEX_MIN_CHAINWALK and a waiter must be pending. That means it -works only for in-kernel locks because the futex interface always uses -RT_MUTEX_FULL_CHAINWALK. -The requirement for an active waiter limits the detector to actual -deadlocks and makes it possible to report potential deadlocks like -lockdep does. -It looks like lockdep is better suited for reporting deadlocks. - -Remove rtmutex' debug print on deadlock detection. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/rtmutex.h | 7 --- - kernel/locking/rtmutex-debug.c | 97 --------------------------------- - kernel/locking/rtmutex-debug.h | 11 ---- - kernel/locking/rtmutex.c | 9 --- - kernel/locking/rtmutex.h | 7 --- - kernel/locking/rtmutex_common.h | 4 -- - 6 files changed, 135 deletions(-) - -diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h -index 16f974a22f51..88a0ba806066 100644 ---- a/include/linux/rtmutex.h -+++ b/include/linux/rtmutex.h -@@ -31,9 +31,6 @@ struct rt_mutex { - raw_spinlock_t wait_lock; - struct rb_root_cached waiters; - struct task_struct *owner; --#ifdef CONFIG_DEBUG_RT_MUTEXES -- const char *name; --#endif - #ifdef CONFIG_DEBUG_LOCK_ALLOC - struct lockdep_map dep_map; - #endif -@@ -56,8 +53,6 @@ struct hrtimer_sleeper; - #endif - - #ifdef CONFIG_DEBUG_RT_MUTEXES --# define __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \ -- , .name = #mutexname - - # define rt_mutex_init(mutex) \ - do { \ -@@ -67,7 +62,6 @@ do { \ - - extern void rt_mutex_debug_task_free(struct task_struct *tsk); - #else --# define __DEBUG_RT_MUTEX_INITIALIZER(mutexname) - # define rt_mutex_init(mutex) __rt_mutex_init(mutex, NULL, NULL) - # define rt_mutex_debug_task_free(t) do { } while (0) - #endif -@@ -83,7 +77,6 @@ do { \ - { .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(mutexname.wait_lock) \ - , .waiters = RB_ROOT_CACHED \ - , .owner = NULL \ -- __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \ - 
__DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)} - - #define DEFINE_RT_MUTEX(mutexname) \ -diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c -index 7e411b946d4c..fb150100335f 100644 ---- a/kernel/locking/rtmutex-debug.c -+++ b/kernel/locking/rtmutex-debug.c -@@ -32,105 +32,12 @@ - - #include "rtmutex_common.h" - --static void printk_task(struct task_struct *p) --{ -- if (p) -- printk("%16s:%5d [%p, %3d]", p->comm, task_pid_nr(p), p, p->prio); -- else -- printk("<none>"); --} -- --static void printk_lock(struct rt_mutex *lock, int print_owner) --{ -- printk(" [%p] {%s}\n", lock, lock->name); -- -- if (print_owner && rt_mutex_owner(lock)) { -- printk(".. ->owner: %p\n", lock->owner); -- printk(".. held by: "); -- printk_task(rt_mutex_owner(lock)); -- printk("\n"); -- } --} -- - void rt_mutex_debug_task_free(struct task_struct *task) - { - DEBUG_LOCKS_WARN_ON(!RB_EMPTY_ROOT(&task->pi_waiters.rb_root)); - DEBUG_LOCKS_WARN_ON(task->pi_blocked_on); - } - --/* -- * We fill out the fields in the waiter to store the information about -- * the deadlock. We print when we return. act_waiter can be NULL in -- * case of a remove waiter operation. 
-- */ --void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk, -- struct rt_mutex_waiter *act_waiter, -- struct rt_mutex *lock) --{ -- struct task_struct *task; -- -- if (!debug_locks || chwalk == RT_MUTEX_FULL_CHAINWALK || !act_waiter) -- return; -- -- task = rt_mutex_owner(act_waiter->lock); -- if (task && task != current) { -- act_waiter->deadlock_task_pid = get_pid(task_pid(task)); -- act_waiter->deadlock_lock = lock; -- } --} -- --void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter) --{ -- struct task_struct *task; -- -- if (!waiter->deadlock_lock || !debug_locks) -- return; -- -- rcu_read_lock(); -- task = pid_task(waiter->deadlock_task_pid, PIDTYPE_PID); -- if (!task) { -- rcu_read_unlock(); -- return; -- } -- -- if (!debug_locks_off()) { -- rcu_read_unlock(); -- return; -- } -- -- pr_warn("\n"); -- pr_warn("============================================\n"); -- pr_warn("WARNING: circular locking deadlock detected!\n"); -- pr_warn("%s\n", print_tainted()); -- pr_warn("--------------------------------------------\n"); -- printk("%s/%d is deadlocking current task %s/%d\n\n", -- task->comm, task_pid_nr(task), -- current->comm, task_pid_nr(current)); -- -- printk("\n1) %s/%d is trying to acquire this lock:\n", -- current->comm, task_pid_nr(current)); -- printk_lock(waiter->lock, 1); -- -- printk("\n2) %s/%d is blocked on this lock:\n", -- task->comm, task_pid_nr(task)); -- printk_lock(waiter->deadlock_lock, 1); -- -- debug_show_held_locks(current); -- debug_show_held_locks(task); -- -- printk("\n%s/%d's [blocked] stackdump:\n\n", -- task->comm, task_pid_nr(task)); -- show_stack(task, NULL, KERN_DEFAULT); -- printk("\n%s/%d's [current] stackdump:\n\n", -- current->comm, task_pid_nr(current)); -- dump_stack(); -- debug_show_all_locks(); -- rcu_read_unlock(); -- -- printk("[ turning off deadlock detection." -- "Please report this trace. 
]\n\n"); --} -- - void debug_rt_mutex_lock(struct rt_mutex *lock) - { - } -@@ -153,12 +60,10 @@ void debug_rt_mutex_proxy_unlock(struct rt_mutex *lock) - void debug_rt_mutex_init_waiter(struct rt_mutex_waiter *waiter) - { - memset(waiter, 0x11, sizeof(*waiter)); -- waiter->deadlock_task_pid = NULL; - } - - void debug_rt_mutex_free_waiter(struct rt_mutex_waiter *waiter) - { -- put_pid(waiter->deadlock_task_pid); - memset(waiter, 0x22, sizeof(*waiter)); - } - -@@ -168,10 +73,8 @@ void debug_rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_cl - * Make sure we are not reinitializing a held lock: - */ - debug_check_no_locks_freed((void *)lock, sizeof(*lock)); -- lock->name = name; - - #ifdef CONFIG_DEBUG_LOCK_ALLOC - lockdep_init_map(&lock->dep_map, name, key, 0); - #endif - } -- -diff --git a/kernel/locking/rtmutex-debug.h b/kernel/locking/rtmutex-debug.h -index fc549713bba3..659e93e256c6 100644 ---- a/kernel/locking/rtmutex-debug.h -+++ b/kernel/locking/rtmutex-debug.h -@@ -18,20 +18,9 @@ extern void debug_rt_mutex_unlock(struct rt_mutex *lock); - extern void debug_rt_mutex_proxy_lock(struct rt_mutex *lock, - struct task_struct *powner); - extern void debug_rt_mutex_proxy_unlock(struct rt_mutex *lock); --extern void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk, -- struct rt_mutex_waiter *waiter, -- struct rt_mutex *lock); --extern void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter); --# define debug_rt_mutex_reset_waiter(w) \ -- do { (w)->deadlock_lock = NULL; } while (0) - - static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter, - enum rtmutex_chainwalk walk) - { - return (waiter != NULL); - } -- --static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w) --{ -- debug_rt_mutex_print_deadlock(w); --} -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index 3624e8bbee28..d2a90977ea7c 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -597,7 +597,6 @@ 
static int rt_mutex_adjust_prio_chain(struct task_struct *task, - * walk, we detected a deadlock. - */ - if (lock == orig_lock || rt_mutex_owner(lock) == top_task) { -- debug_rt_mutex_deadlock(chwalk, orig_waiter, lock); - raw_spin_unlock(&lock->wait_lock); - ret = -EDEADLK; - goto out_unlock_pi; -@@ -1189,8 +1188,6 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state, - - raw_spin_unlock_irq(&lock->wait_lock); - -- debug_rt_mutex_print_deadlock(waiter); -- - schedule(); - - raw_spin_lock_irq(&lock->wait_lock); -@@ -1211,10 +1208,6 @@ static void rt_mutex_handle_deadlock(int res, int detect_deadlock, - if (res != -EDEADLOCK || detect_deadlock) - return; - -- /* -- * Yell lowdly and stop the task right here. -- */ -- rt_mutex_print_deadlock(w); - while (1) { - set_current_state(TASK_INTERRUPTIBLE); - schedule(); -@@ -1763,8 +1756,6 @@ int __rt_mutex_start_proxy_lock(struct rt_mutex *lock, - ret = 0; - } - -- debug_rt_mutex_print_deadlock(waiter); -- - return ret; - } - -diff --git a/kernel/locking/rtmutex.h b/kernel/locking/rtmutex.h -index 732f96abf462..338ccd29119a 100644 ---- a/kernel/locking/rtmutex.h -+++ b/kernel/locking/rtmutex.h -@@ -19,15 +19,8 @@ - #define debug_rt_mutex_proxy_unlock(l) do { } while (0) - #define debug_rt_mutex_unlock(l) do { } while (0) - #define debug_rt_mutex_init(m, n, k) do { } while (0) --#define debug_rt_mutex_deadlock(d, a ,l) do { } while (0) --#define debug_rt_mutex_print_deadlock(w) do { } while (0) - #define debug_rt_mutex_reset_waiter(w) do { } while (0) - --static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w) --{ -- WARN(1, "rtmutex deadlock detected\n"); --} -- - static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *w, - enum rtmutex_chainwalk walk) - { -diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h -index e6913103d7ff..b1455dc2366f 100644 ---- a/kernel/locking/rtmutex_common.h -+++ b/kernel/locking/rtmutex_common.h -@@ -29,10 +29,6 @@ struct 
rt_mutex_waiter { - struct rb_node pi_tree_entry; - struct task_struct *task; - struct rt_mutex *lock; --#ifdef CONFIG_DEBUG_RT_MUTEXES -- struct pid *deadlock_task_pid; -- struct rt_mutex *deadlock_lock; --#endif - int prio; - u64 deadline; - }; --- -2.30.2 - diff --git a/debian/patches-rt/0161-locking-rtmutex-Move-rt_mutex_init-outside-of-CONFIG.patch b/debian/patches-rt/0161-locking-rtmutex-Move-rt_mutex_init-outside-of-CONFIG.patch deleted file mode 100644 index d623eb369..000000000 --- a/debian/patches-rt/0161-locking-rtmutex-Move-rt_mutex_init-outside-of-CONFIG.patch +++ /dev/null @@ -1,60 +0,0 @@ -From 467565a3950497d85595e22d3292bb029eaf76e2 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 29 Sep 2020 16:32:49 +0200 -Subject: [PATCH 161/296] locking/rtmutex: Move rt_mutex_init() outside of - CONFIG_DEBUG_RT_MUTEXES -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -rt_mutex_init() only initializes lockdep if CONFIG_DEBUG_RT_MUTEXES is -enabled. The static initializer (DEFINE_RT_MUTEX) does not have such a -restriction. - -Move rt_mutex_init() outside of CONFIG_DEBUG_RT_MUTEXES. -Move the remaining functions in this CONFIG_DEBUG_RT_MUTEXES block to -the upper block. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/rtmutex.h | 12 +++--------- - 1 file changed, 3 insertions(+), 9 deletions(-) - -diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h -index 88a0ba806066..2dc10b582d4a 100644 ---- a/include/linux/rtmutex.h -+++ b/include/linux/rtmutex.h -@@ -43,6 +43,7 @@ struct hrtimer_sleeper; - extern int rt_mutex_debug_check_no_locks_freed(const void *from, - unsigned long len); - extern void rt_mutex_debug_check_no_locks_held(struct task_struct *task); -+ extern void rt_mutex_debug_task_free(struct task_struct *tsk); - #else - static inline int rt_mutex_debug_check_no_locks_freed(const void *from, - unsigned long len) -@@ -50,22 +51,15 @@ struct hrtimer_sleeper; - return 0; - } - # define rt_mutex_debug_check_no_locks_held(task) do { } while (0) -+# define rt_mutex_debug_task_free(t) do { } while (0) - #endif - --#ifdef CONFIG_DEBUG_RT_MUTEXES -- --# define rt_mutex_init(mutex) \ -+#define rt_mutex_init(mutex) \ - do { \ - static struct lock_class_key __key; \ - __rt_mutex_init(mutex, __func__, &__key); \ - } while (0) - -- extern void rt_mutex_debug_task_free(struct task_struct *tsk); --#else --# define rt_mutex_init(mutex) __rt_mutex_init(mutex, NULL, NULL) --# define rt_mutex_debug_task_free(t) do { } while (0) --#endif -- - #ifdef CONFIG_DEBUG_LOCK_ALLOC - #define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname) \ - , .dep_map = { .name = #mutexname } --- -2.30.2 - diff --git a/debian/patches-rt/0162-locking-rtmutex-Remove-rt_mutex_timed_lock.patch b/debian/patches-rt/0162-locking-rtmutex-Remove-rt_mutex_timed_lock.patch deleted file mode 100644 index 135054f44..000000000 --- a/debian/patches-rt/0162-locking-rtmutex-Remove-rt_mutex_timed_lock.patch +++ /dev/null @@ -1,98 +0,0 @@ -From 9bed15bc5982cdfec0f452d67e53008cc4c67b02 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 7 Oct 2020 12:11:33 +0200 -Subject: [PATCH 162/296] 
locking/rtmutex: Remove rt_mutex_timed_lock() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -rt_mutex_timed_lock() has no callers since commit - c051b21f71d1f ("rtmutex: Confine deadlock logic to futex") - -Remove rt_mutex_timed_lock(). - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/rtmutex.h | 3 --- - kernel/locking/rtmutex.c | 46 ---------------------------------------- - 2 files changed, 49 deletions(-) - -diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h -index 2dc10b582d4a..243fabc2c85f 100644 ---- a/include/linux/rtmutex.h -+++ b/include/linux/rtmutex.h -@@ -99,9 +99,6 @@ extern void rt_mutex_lock(struct rt_mutex *lock); - #endif - - extern int rt_mutex_lock_interruptible(struct rt_mutex *lock); --extern int rt_mutex_timed_lock(struct rt_mutex *lock, -- struct hrtimer_sleeper *timeout); -- - extern int rt_mutex_trylock(struct rt_mutex *lock); - - extern void rt_mutex_unlock(struct rt_mutex *lock); -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index d2a90977ea7c..1d0e9bf0487a 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -1405,21 +1405,6 @@ rt_mutex_fastlock(struct rt_mutex *lock, int state, - return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK); - } - --static inline int --rt_mutex_timed_fastlock(struct rt_mutex *lock, int state, -- struct hrtimer_sleeper *timeout, -- enum rtmutex_chainwalk chwalk, -- int (*slowfn)(struct rt_mutex *lock, int state, -- struct hrtimer_sleeper *timeout, -- enum rtmutex_chainwalk chwalk)) --{ -- if (chwalk == RT_MUTEX_MIN_CHAINWALK && -- likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) -- return 0; -- -- return slowfn(lock, state, timeout, chwalk); --} -- - static inline int - rt_mutex_fasttrylock(struct rt_mutex *lock, - int (*slowfn)(struct rt_mutex *lock)) -@@ -1527,37 +1512,6 @@ int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock) - return 
__rt_mutex_slowtrylock(lock); - } - --/** -- * rt_mutex_timed_lock - lock a rt_mutex interruptible -- * the timeout structure is provided -- * by the caller -- * -- * @lock: the rt_mutex to be locked -- * @timeout: timeout structure or NULL (no timeout) -- * -- * Returns: -- * 0 on success -- * -EINTR when interrupted by a signal -- * -ETIMEDOUT when the timeout expired -- */ --int --rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout) --{ -- int ret; -- -- might_sleep(); -- -- mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_); -- ret = rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, -- RT_MUTEX_MIN_CHAINWALK, -- rt_mutex_slowlock); -- if (ret) -- mutex_release(&lock->dep_map, _RET_IP_); -- -- return ret; --} --EXPORT_SYMBOL_GPL(rt_mutex_timed_lock); -- - /** - * rt_mutex_trylock - try to lock a rt_mutex - * --- -2.30.2 - diff --git a/debian/patches-rt/0163-locking-rtmutex-Handle-the-various-new-futex-race-co.patch b/debian/patches-rt/0163-locking-rtmutex-Handle-the-various-new-futex-race-co.patch deleted file mode 100644 index e5f77218c..000000000 --- a/debian/patches-rt/0163-locking-rtmutex-Handle-the-various-new-futex-race-co.patch +++ /dev/null @@ -1,255 +0,0 @@ -From 9ea37f897289f3d56ce560a606b2e9e8ce605687 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 10 Jun 2011 11:04:15 +0200 -Subject: [PATCH 163/296] locking/rtmutex: Handle the various new futex race - conditions -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -RT opens a few new interesting race conditions in the rtmutex/futex -combo due to futex hash bucket lock being a 'sleeping' spinlock and -therefor not disabling preemption. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - kernel/futex.c | 77 ++++++++++++++++++++++++++------- - kernel/locking/rtmutex.c | 36 ++++++++++++--- - kernel/locking/rtmutex_common.h | 2 + - 3 files changed, 94 insertions(+), 21 deletions(-) - -diff --git a/kernel/futex.c b/kernel/futex.c -index 7cf1987cfdb4..6d7aaa8fe439 100644 ---- a/kernel/futex.c -+++ b/kernel/futex.c -@@ -2156,6 +2156,16 @@ static int futex_requeue(u32 __user *uaddr1, unsigned int flags, - */ - requeue_pi_wake_futex(this, &key2, hb2); - continue; -+ } else if (ret == -EAGAIN) { -+ /* -+ * Waiter was woken by timeout or -+ * signal and has set pi_blocked_on to -+ * PI_WAKEUP_INPROGRESS before we -+ * tried to enqueue it on the rtmutex. -+ */ -+ this->pi_state = NULL; -+ put_pi_state(pi_state); -+ continue; - } else if (ret) { - /* - * rt_mutex_start_proxy_lock() detected a -@@ -3173,7 +3183,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, - { - struct hrtimer_sleeper timeout, *to; - struct rt_mutex_waiter rt_waiter; -- struct futex_hash_bucket *hb; -+ struct futex_hash_bucket *hb, *hb2; - union futex_key key2 = FUTEX_KEY_INIT; - struct futex_q q = futex_q_init; - int res, ret; -@@ -3225,20 +3235,55 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, - /* Queue the futex_q, drop the hb lock, wait for wakeup. */ - futex_wait_queue_me(hb, &q, to); - -- spin_lock(&hb->lock); -- ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to); -- spin_unlock(&hb->lock); -- if (ret) -- goto out; -+ /* -+ * On RT we must avoid races with requeue and trying to block -+ * on two mutexes (hb->lock and uaddr2's rtmutex) by -+ * serializing access to pi_blocked_on with pi_lock. -+ */ -+ raw_spin_lock_irq(¤t->pi_lock); -+ if (current->pi_blocked_on) { -+ /* -+ * We have been requeued or are in the process of -+ * being requeued. 
-+ */ -+ raw_spin_unlock_irq(¤t->pi_lock); -+ } else { -+ /* -+ * Setting pi_blocked_on to PI_WAKEUP_INPROGRESS -+ * prevents a concurrent requeue from moving us to the -+ * uaddr2 rtmutex. After that we can safely acquire -+ * (and possibly block on) hb->lock. -+ */ -+ current->pi_blocked_on = PI_WAKEUP_INPROGRESS; -+ raw_spin_unlock_irq(¤t->pi_lock); -+ -+ spin_lock(&hb->lock); -+ -+ /* -+ * Clean up pi_blocked_on. We might leak it otherwise -+ * when we succeeded with the hb->lock in the fast -+ * path. -+ */ -+ raw_spin_lock_irq(¤t->pi_lock); -+ current->pi_blocked_on = NULL; -+ raw_spin_unlock_irq(¤t->pi_lock); -+ -+ ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to); -+ spin_unlock(&hb->lock); -+ if (ret) -+ goto out; -+ } - - /* -- * In order for us to be here, we know our q.key == key2, and since -- * we took the hb->lock above, we also know that futex_requeue() has -- * completed and we no longer have to concern ourselves with a wakeup -- * race with the atomic proxy lock acquisition by the requeue code. The -- * futex_requeue dropped our key1 reference and incremented our key2 -- * reference count. -+ * In order to be here, we have either been requeued, are in -+ * the process of being requeued, or requeue successfully -+ * acquired uaddr2 on our behalf. If pi_blocked_on was -+ * non-null above, we may be racing with a requeue. Do not -+ * rely on q->lock_ptr to be hb2->lock until after blocking on -+ * hb->lock or hb2->lock. The futex_requeue dropped our key1 -+ * reference and incremented our key2 reference count. - */ -+ hb2 = hash_futex(&key2); - - /* Check if the requeue code acquired the second futex for us. */ - if (!q.rt_waiter) { -@@ -3247,14 +3292,15 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, - * did a lock-steal - fix up the PI-state in that case. 
- */ - if (q.pi_state && (q.pi_state->owner != current)) { -- spin_lock(q.lock_ptr); -+ spin_lock(&hb2->lock); -+ BUG_ON(&hb2->lock != q.lock_ptr); - ret = fixup_pi_state_owner(uaddr2, &q, current); - /* - * Drop the reference to the pi state which - * the requeue_pi() code acquired for us. - */ - put_pi_state(q.pi_state); -- spin_unlock(q.lock_ptr); -+ spin_unlock(&hb2->lock); - /* - * Adjust the return value. It's either -EFAULT or - * success (1) but the caller expects 0 for success. -@@ -3273,7 +3319,8 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, - pi_mutex = &q.pi_state->pi_mutex; - ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter); - -- spin_lock(q.lock_ptr); -+ spin_lock(&hb2->lock); -+ BUG_ON(&hb2->lock != q.lock_ptr); - if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter)) - ret = 0; - -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index 1d0e9bf0487a..97a5fb19119d 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -136,6 +136,11 @@ static void fixup_rt_mutex_waiters(struct rt_mutex *lock) - WRITE_ONCE(*p, owner & ~RT_MUTEX_HAS_WAITERS); - } - -+static int rt_mutex_real_waiter(struct rt_mutex_waiter *waiter) -+{ -+ return waiter && waiter != PI_WAKEUP_INPROGRESS; -+} -+ - /* - * We can speed up the acquire/release, if there's no debugging state to be - * set up. -@@ -378,7 +383,8 @@ int max_lock_depth = 1024; - - static inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p) - { -- return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL; -+ return rt_mutex_real_waiter(p->pi_blocked_on) ? -+ p->pi_blocked_on->lock : NULL; - } - - /* -@@ -514,7 +520,7 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task, - * reached or the state of the chain has changed while we - * dropped the locks. 
- */ -- if (!waiter) -+ if (!rt_mutex_real_waiter(waiter)) - goto out_unlock_pi; - - /* -@@ -947,6 +953,22 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock, - return -EDEADLK; - - raw_spin_lock(&task->pi_lock); -+ /* -+ * In the case of futex requeue PI, this will be a proxy -+ * lock. The task will wake unaware that it is enqueueed on -+ * this lock. Avoid blocking on two locks and corrupting -+ * pi_blocked_on via the PI_WAKEUP_INPROGRESS -+ * flag. futex_wait_requeue_pi() sets this when it wakes up -+ * before requeue (due to a signal or timeout). Do not enqueue -+ * the task if PI_WAKEUP_INPROGRESS is set. -+ */ -+ if (task != current && task->pi_blocked_on == PI_WAKEUP_INPROGRESS) { -+ raw_spin_unlock(&task->pi_lock); -+ return -EAGAIN; -+ } -+ -+ BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)); -+ - waiter->task = task; - waiter->lock = lock; - waiter->prio = task->prio; -@@ -970,7 +992,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock, - rt_mutex_enqueue_pi(owner, waiter); - - rt_mutex_adjust_prio(owner); -- if (owner->pi_blocked_on) -+ if (rt_mutex_real_waiter(owner->pi_blocked_on)) - chain_walk = 1; - } else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) { - chain_walk = 1; -@@ -1066,7 +1088,7 @@ static void remove_waiter(struct rt_mutex *lock, - { - bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock)); - struct task_struct *owner = rt_mutex_owner(lock); -- struct rt_mutex *next_lock; -+ struct rt_mutex *next_lock = NULL; - - lockdep_assert_held(&lock->wait_lock); - -@@ -1092,7 +1114,8 @@ static void remove_waiter(struct rt_mutex *lock, - rt_mutex_adjust_prio(owner); - - /* Store the lock on which owner is blocked or NULL */ -- next_lock = task_blocked_on_lock(owner); -+ if (rt_mutex_real_waiter(owner->pi_blocked_on)) -+ next_lock = task_blocked_on_lock(owner); - - raw_spin_unlock(&owner->pi_lock); - -@@ -1128,7 +1151,8 @@ void rt_mutex_adjust_pi(struct task_struct *task) - raw_spin_lock_irqsave(&task->pi_lock, flags); - 
- waiter = task->pi_blocked_on; -- if (!waiter || rt_mutex_waiter_equal(waiter, task_to_waiter(task))) { -+ if (!rt_mutex_real_waiter(waiter) || -+ rt_mutex_waiter_equal(waiter, task_to_waiter(task))) { - raw_spin_unlock_irqrestore(&task->pi_lock, flags); - return; - } -diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h -index b1455dc2366f..096b16cfb096 100644 ---- a/kernel/locking/rtmutex_common.h -+++ b/kernel/locking/rtmutex_common.h -@@ -125,6 +125,8 @@ enum rtmutex_chainwalk { - /* - * PI-futex support (proxy locking functions, etc.): - */ -+#define PI_WAKEUP_INPROGRESS ((struct rt_mutex_waiter *) 1) -+ - extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock); - extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock, - struct task_struct *proxy_owner); --- -2.30.2 - diff --git a/debian/patches-rt/0164-futex-Fix-bug-on-when-a-requeued-RT-task-times-out.patch b/debian/patches-rt/0164-futex-Fix-bug-on-when-a-requeued-RT-task-times-out.patch deleted file mode 100644 index eb9b73099..000000000 --- a/debian/patches-rt/0164-futex-Fix-bug-on-when-a-requeued-RT-task-times-out.patch +++ /dev/null @@ -1,118 +0,0 @@ -From 4ff24e66c2a9d3d0341c58c3af2983a9e18379dd Mon Sep 17 00:00:00 2001 -From: Steven Rostedt <rostedt@goodmis.org> -Date: Tue, 14 Jul 2015 14:26:34 +0200 -Subject: [PATCH 164/296] futex: Fix bug on when a requeued RT task times out -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Requeue with timeout causes a bug with PREEMPT_RT. - -The bug comes from a timed out condition. - - TASK 1 TASK 2 - ------ ------ - futex_wait_requeue_pi() - futex_wait_queue_me() - <timed out> - - double_lock_hb(); - - raw_spin_lock(pi_lock); - if (current->pi_blocked_on) { - } else { - current->pi_blocked_on = PI_WAKE_INPROGRESS; - run_spin_unlock(pi_lock); - spin_lock(hb->lock); <-- blocked! 
-
- plist_for_each_entry_safe(this) {
- rt_mutex_start_proxy_lock();
- task_blocks_on_rt_mutex();
- BUG_ON(task->pi_blocked_on)!!!!
-
-The BUG_ON() actually has a check for PI_WAKE_INPROGRESS, but the
-problem is that, after TASK 1 sets PI_WAKE_INPROGRESS, it then tries to
-grab the hb->lock, which it fails to do so. As the hb->lock is a mutex,
-it will block and set the "pi_blocked_on" to the hb->lock.
-
-When TASK 2 goes to requeue it, the check for PI_WAKE_INPROGESS fails
-because the task1's pi_blocked_on is no longer set to that, but instead,
-set to the hb->lock.
-
-The fix:
-
-When calling rt_mutex_start_proxy_lock() a check is made to see
-if the proxy tasks pi_blocked_on is set. If so, exit out early.
-Otherwise set it to a new flag PI_REQUEUE_INPROGRESS, which notifies
-the proxy task that it is being requeued, and will handle things
-appropriately.
-
-Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
----
- kernel/locking/rtmutex.c | 31 ++++++++++++++++++++++++++++++-
- kernel/locking/rtmutex_common.h | 1 +
- 2 files changed, 31 insertions(+), 1 deletion(-)
-
-diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
-index 97a5fb19119d..17718a5ebdb2 100644
---- a/kernel/locking/rtmutex.c
-+++ b/kernel/locking/rtmutex.c
-@@ -138,7 +138,8 @@ static void fixup_rt_mutex_waiters(struct rt_mutex *lock)
-
- static int rt_mutex_real_waiter(struct rt_mutex_waiter *waiter)
- {
-- return waiter && waiter != PI_WAKEUP_INPROGRESS;
-+ return waiter && waiter != PI_WAKEUP_INPROGRESS &&
-+ waiter != PI_REQUEUE_INPROGRESS;
- }
-
- /*
-@@ -1720,6 +1721,34 @@ int __rt_mutex_start_proxy_lock(struct rt_mutex *lock,
- if (try_to_take_rt_mutex(lock, task, NULL))
- return 1;
-
-+#ifdef CONFIG_PREEMPT_RT
-+ /*
-+ * In PREEMPT_RT there's an added race.
-+ * If the task, that we are about to requeue, times out,
-+ * it can set the PI_WAKEUP_INPROGRESS. This tells the requeue
-+ * to skip this task. But right after the task sets
-+ * its pi_blocked_on to PI_WAKEUP_INPROGRESS it can then
-+ * block on the spin_lock(&hb->lock), which in RT is an rtmutex.
-+ * This will replace the PI_WAKEUP_INPROGRESS with the actual
-+ * lock that it blocks on. We *must not* place this task
-+ * on this proxy lock in that case.
-+ *
-+ * To prevent this race, we first take the task's pi_lock
-+ * and check if it has updated its pi_blocked_on. If it has,
-+ * we assume that it woke up and we return -EAGAIN.
-+ * Otherwise, we set the task's pi_blocked_on to
-+ * PI_REQUEUE_INPROGRESS, so that if the task is waking up
-+ * it will know that we are in the process of requeuing it.
-+ */
-+ raw_spin_lock(&task->pi_lock);
-+ if (task->pi_blocked_on) {
-+ raw_spin_unlock(&task->pi_lock);
-+ return -EAGAIN;
-+ }
-+ task->pi_blocked_on = PI_REQUEUE_INPROGRESS;
-+ raw_spin_unlock(&task->pi_lock);
-+#endif
-+
- /* We enforce deadlock detection for futexes */
- ret = task_blocks_on_rt_mutex(lock, waiter, task,
- RT_MUTEX_FULL_CHAINWALK);
-diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
-index 096b16cfb096..37cd6b3bf6f4 100644
---- a/kernel/locking/rtmutex_common.h
-+++ b/kernel/locking/rtmutex_common.h
-@@ -126,6 +126,7 @@ enum rtmutex_chainwalk {
- * PI-futex support (proxy locking functions, etc.):
- */
- #define PI_WAKEUP_INPROGRESS ((struct rt_mutex_waiter *) 1)
-+#define PI_REQUEUE_INPROGRESS ((struct rt_mutex_waiter *) 2)
-
- extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock);
- extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock,
---
-2.30.2
-
diff --git a/debian/patches-rt/0165-locking-rtmutex-Make-lock_killable-work.patch
deleted file mode 100644
index 65da07794..000000000
--- a/debian/patches-rt/0165-locking-rtmutex-Make-lock_killable-work.patch
+++ /dev/null
@@ -1,50 +0,0 @@
-From cec310e01b057429212c66d7d8cfaaaafbfdfa6b Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Sat, 1 Apr 2017 12:50:59 +0200
-Subject: [PATCH 165/296] locking/rtmutex: Make lock_killable work
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Locking an rt mutex killable does not work because signal handling is
-restricted to TASK_INTERRUPTIBLE.
-
-Use signal_pending_state() unconditionally.
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- kernel/locking/rtmutex.c | 19 +++++++------------
- 1 file changed, 7 insertions(+), 12 deletions(-)
-
-diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
-index 17718a5ebdb2..5148a2b49c55 100644
---- a/kernel/locking/rtmutex.c
-+++ b/kernel/locking/rtmutex.c
-@@ -1197,18 +1197,13 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
- if (try_to_take_rt_mutex(lock, current, waiter))
- break;
-
-- /*
-- * TASK_INTERRUPTIBLE checks for signals and
-- * timeout. Ignored otherwise.
-- */
-- if (likely(state == TASK_INTERRUPTIBLE)) {
-- /* Signal pending? */
-- if (signal_pending(current))
-- ret = -EINTR;
-- if (timeout && !timeout->task)
-- ret = -ETIMEDOUT;
-- if (ret)
-- break;
-+ if (timeout && !timeout->task) {
-+ ret = -ETIMEDOUT;
-+ break;
-+ }
-+ if (signal_pending_state(state, current)) {
-+ ret = -EINTR;
-+ break;
- }
-
- raw_spin_unlock_irq(&lock->wait_lock);
---
-2.30.2
-
diff --git a/debian/patches-rt/0166-locking-spinlock-Split-the-lock-types-header.patch
deleted file mode 100644
index 72a1fe82b..000000000
--- a/debian/patches-rt/0166-locking-spinlock-Split-the-lock-types-header.patch
+++ /dev/null
@@ -1,253 +0,0 @@
-From b9cd8312e1f038a7d41f52ecdfda6ad05c65cf69 Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Wed, 29 Jun 2011 19:34:01 +0200
-Subject: [PATCH 166/296] locking/spinlock: Split the lock types header
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Split raw_spinlock into its own file and the remaining spinlock_t into
-its own non-RT header. The non-RT header will be replaced later by sleeping
-spinlocks.
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
----
- include/linux/rwlock_types.h | 4 ++
- include/linux/spinlock_types.h | 87 +----------------------------
- include/linux/spinlock_types_nort.h | 39 +++++++++++++
- include/linux/spinlock_types_raw.h | 65 +++++++++++++++++++++
- 4 files changed, 110 insertions(+), 85 deletions(-)
- create mode 100644 include/linux/spinlock_types_nort.h
- create mode 100644 include/linux/spinlock_types_raw.h
-
-diff --git a/include/linux/rwlock_types.h b/include/linux/rwlock_types.h
-index 3bd03e18061c..0ad226b5d8fd 100644
---- a/include/linux/rwlock_types.h
-+++ b/include/linux/rwlock_types.h
-@@ -1,6 +1,10 @@
- #ifndef __LINUX_RWLOCK_TYPES_H
- #define __LINUX_RWLOCK_TYPES_H
-
-+#if !defined(__LINUX_SPINLOCK_TYPES_H)
-+# error "Do not include directly, include spinlock_types.h"
-+#endif
-+
- /*
- * include/linux/rwlock_types.h - generic rwlock type definitions
- * and initializers
-diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
-index b981caafe8bf..5c8664d57fb8 100644
---- a/include/linux/spinlock_types.h
-+++ b/include/linux/spinlock_types.h
-@@ -9,92 +9,9 @@
- * Released under the General Public License (GPL).
- */
-
--#if defined(CONFIG_SMP)
--# include <asm/spinlock_types.h>
--#else
--# include <linux/spinlock_types_up.h>
--#endif
-+#include <linux/spinlock_types_raw.h>
-
--#include <linux/lockdep_types.h>
--
--typedef struct raw_spinlock {
-- arch_spinlock_t raw_lock;
--#ifdef CONFIG_DEBUG_SPINLOCK
-- unsigned int magic, owner_cpu;
-- void *owner;
--#endif
--#ifdef CONFIG_DEBUG_LOCK_ALLOC
-- struct lockdep_map dep_map;
--#endif
--} raw_spinlock_t;
--
--#define SPINLOCK_MAGIC 0xdead4ead
--
--#define SPINLOCK_OWNER_INIT ((void *)-1L)
--
--#ifdef CONFIG_DEBUG_LOCK_ALLOC
--# define RAW_SPIN_DEP_MAP_INIT(lockname) \
-- .dep_map = { \
-- .name = #lockname, \
-- .wait_type_inner = LD_WAIT_SPIN, \
-- }
--# define SPIN_DEP_MAP_INIT(lockname) \
-- .dep_map = { \
-- .name = #lockname, \
-- .wait_type_inner = LD_WAIT_CONFIG, \
-- }
--#else
--# define RAW_SPIN_DEP_MAP_INIT(lockname)
--# define SPIN_DEP_MAP_INIT(lockname)
--#endif
--
--#ifdef CONFIG_DEBUG_SPINLOCK
--# define SPIN_DEBUG_INIT(lockname) \
-- .magic = SPINLOCK_MAGIC, \
-- .owner_cpu = -1, \
-- .owner = SPINLOCK_OWNER_INIT,
--#else
--# define SPIN_DEBUG_INIT(lockname)
--#endif
--
--#define __RAW_SPIN_LOCK_INITIALIZER(lockname) \
-- { \
-- .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
-- SPIN_DEBUG_INIT(lockname) \
-- RAW_SPIN_DEP_MAP_INIT(lockname) }
--
--#define __RAW_SPIN_LOCK_UNLOCKED(lockname) \
-- (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
--
--#define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
--
--typedef struct spinlock {
-- union {
-- struct raw_spinlock rlock;
--
--#ifdef CONFIG_DEBUG_LOCK_ALLOC
--# define LOCK_PADSIZE (offsetof(struct raw_spinlock, dep_map))
-- struct {
-- u8 __padding[LOCK_PADSIZE];
-- struct lockdep_map dep_map;
-- };
--#endif
-- };
--} spinlock_t;
--
--#define ___SPIN_LOCK_INITIALIZER(lockname) \
-- { \
-- .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
-- SPIN_DEBUG_INIT(lockname) \
-- SPIN_DEP_MAP_INIT(lockname) }
--
--#define __SPIN_LOCK_INITIALIZER(lockname) \
-- { { .rlock = ___SPIN_LOCK_INITIALIZER(lockname) } }
--
--#define __SPIN_LOCK_UNLOCKED(lockname) \
-- (spinlock_t) __SPIN_LOCK_INITIALIZER(lockname)
--
--#define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
-+#include <linux/spinlock_types_nort.h>
-
- #include <linux/rwlock_types.h>
-
-diff --git a/include/linux/spinlock_types_nort.h b/include/linux/spinlock_types_nort.h
-new file mode 100644
-index 000000000000..e4549f0dd197
---- /dev/null
-+++ b/include/linux/spinlock_types_nort.h
-@@ -0,0 +1,39 @@
-+#ifndef __LINUX_SPINLOCK_TYPES_NORT_H
-+#define __LINUX_SPINLOCK_TYPES_NORT_H
-+
-+#ifndef __LINUX_SPINLOCK_TYPES_H
-+#error "Do not include directly. Include spinlock_types.h instead"
-+#endif
-+
-+/*
-+ * The non RT version maps spinlocks to raw_spinlocks
-+ */
-+typedef struct spinlock {
-+ union {
-+ struct raw_spinlock rlock;
-+
-+#ifdef CONFIG_DEBUG_LOCK_ALLOC
-+# define LOCK_PADSIZE (offsetof(struct raw_spinlock, dep_map))
-+ struct {
-+ u8 __padding[LOCK_PADSIZE];
-+ struct lockdep_map dep_map;
-+ };
-+#endif
-+ };
-+} spinlock_t;
-+
-+#define ___SPIN_LOCK_INITIALIZER(lockname) \
-+{ \
-+ .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
-+ SPIN_DEBUG_INIT(lockname) \
-+ SPIN_DEP_MAP_INIT(lockname) }
-+
-+#define __SPIN_LOCK_INITIALIZER(lockname) \
-+ { { .rlock = ___SPIN_LOCK_INITIALIZER(lockname) } }
-+
-+#define __SPIN_LOCK_UNLOCKED(lockname) \
-+ (spinlock_t) __SPIN_LOCK_INITIALIZER(lockname)
-+
-+#define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
-+
-+#endif
-diff --git a/include/linux/spinlock_types_raw.h b/include/linux/spinlock_types_raw.h
-new file mode 100644
-index 000000000000..1d4a180e983d
---- /dev/null
-+++ b/include/linux/spinlock_types_raw.h
-@@ -0,0 +1,65 @@
-+#ifndef __LINUX_SPINLOCK_TYPES_RAW_H
-+#define __LINUX_SPINLOCK_TYPES_RAW_H
-+
-+#include <linux/types.h>
-+
-+#if defined(CONFIG_SMP)
-+# include <asm/spinlock_types.h>
-+#else
-+# include <linux/spinlock_types_up.h>
-+#endif
-+
-+#include <linux/lockdep_types.h>
-+
-+typedef struct raw_spinlock {
-+ arch_spinlock_t raw_lock;
-+#ifdef CONFIG_DEBUG_SPINLOCK
-+ unsigned int magic, owner_cpu;
-+ void *owner;
-+#endif
-+#ifdef CONFIG_DEBUG_LOCK_ALLOC
-+ struct lockdep_map dep_map;
-+#endif
-+} raw_spinlock_t;
-+
-+#define SPINLOCK_MAGIC 0xdead4ead
-+
-+#define SPINLOCK_OWNER_INIT ((void *)-1L)
-+
-+#ifdef CONFIG_DEBUG_LOCK_ALLOC
-+# define RAW_SPIN_DEP_MAP_INIT(lockname) \
-+ .dep_map = { \
-+ .name = #lockname, \
-+ .wait_type_inner = LD_WAIT_SPIN, \
-+ }
-+# define SPIN_DEP_MAP_INIT(lockname) \
-+ .dep_map = { \
-+ .name = #lockname, \
-+ .wait_type_inner = LD_WAIT_CONFIG, \
-+ }
-+#else
-+# define RAW_SPIN_DEP_MAP_INIT(lockname)
-+# define SPIN_DEP_MAP_INIT(lockname)
-+#endif
-+
-+#ifdef CONFIG_DEBUG_SPINLOCK
-+# define SPIN_DEBUG_INIT(lockname) \
-+ .magic = SPINLOCK_MAGIC, \
-+ .owner_cpu = -1, \
-+ .owner = SPINLOCK_OWNER_INIT,
-+#else
-+# define SPIN_DEBUG_INIT(lockname)
-+#endif
-+
-+#define __RAW_SPIN_LOCK_INITIALIZER(lockname) \
-+{ \
-+ .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
-+ SPIN_DEBUG_INIT(lockname) \
-+ RAW_SPIN_DEP_MAP_INIT(lockname) }
-+
-+#define __RAW_SPIN_LOCK_UNLOCKED(lockname) \
-+ (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
-+
-+#define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
-+
-+#endif
---
-2.30.2
-
diff --git a/debian/patches-rt/0167-locking-rtmutex-Avoid-include-hell.patch
deleted file mode 100644
index b6d0f5729..000000000
--- a/debian/patches-rt/0167-locking-rtmutex-Avoid-include-hell.patch
+++ /dev/null
@@ -1,30 +0,0 @@
-From 7c71f8c6b9b19464c6562c5fcf1c308588ee76b3 Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Wed, 29 Jun 2011 20:06:39 +0200
-Subject: [PATCH 167/296] locking/rtmutex: Avoid include hell
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Include only the required raw types. This avoids pulling in the
-complete spinlock header which in turn requires rtmutex.h at some point.
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
----
- include/linux/rtmutex.h | 2 +-
- 1 file changed, 1 insertion(+), 1 deletion(-)
-
-diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
-index 243fabc2c85f..add1dab27df5 100644
---- a/include/linux/rtmutex.h
-+++ b/include/linux/rtmutex.h
-@@ -15,7 +15,7 @@
-
- #include <linux/linkage.h>
- #include <linux/rbtree.h>
--#include <linux/spinlock_types.h>
-+#include <linux/spinlock_types_raw.h>
-
- extern int max_lock_depth; /* for sysctl */
-
---
-2.30.2
-
diff --git a/debian/patches-rt/0168-lockdep-Reduce-header-files-in-debug_locks.h.patch
deleted file mode 100644
index 30f1198b1..000000000
--- a/debian/patches-rt/0168-lockdep-Reduce-header-files-in-debug_locks.h.patch
+++ /dev/null
@@ -1,33 +0,0 @@
-From cc86f84d0a77aea1d611dbc6df80ace3263198d6 Mon Sep 17 00:00:00 2001
-From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-Date: Fri, 14 Aug 2020 16:55:25 +0200
-Subject: [PATCH 168/296] lockdep: Reduce header files in debug_locks.h
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-The inclusion of printk.h leads to circular dependency if spinlock_t is
-based on rt_mutex.
-
-Include only atomic.h (xchg()) and cache.h (__read_mostly).
-
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- include/linux/debug_locks.h | 3 +--
- 1 file changed, 1 insertion(+), 2 deletions(-)
-
-diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
-index 2915f56ad421..5a9e3e3769ce 100644
---- a/include/linux/debug_locks.h
-+++ b/include/linux/debug_locks.h
-@@ -3,8 +3,7 @@
- #define __LINUX_DEBUG_LOCKING_H
-
- #include <linux/atomic.h>
--#include <linux/bug.h>
--#include <linux/printk.h>
-+#include <linux/cache.h>
-
- struct task_struct;
-
---
-2.30.2
-
diff --git a/debian/patches-rt/0169-locking-split-out-the-rbtree-definition.patch
deleted file mode 100644
index 9799ea2f2..000000000
--- a/debian/patches-rt/0169-locking-split-out-the-rbtree-definition.patch
+++ /dev/null
@@ -1,120 +0,0 @@
-From c3c28ab579cc831b8ca595f0e134bb613491bc95 Mon Sep 17 00:00:00 2001
-From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-Date: Fri, 14 Aug 2020 17:08:41 +0200
-Subject: [PATCH 169/296] locking: split out the rbtree definition
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-rtmutex.h needs the definition for rb_root_cached. By including kernel.h
-we will get to spinlock.h which requires rtmutex.h again.
-
-Split out the required struct definition and move it into its own header
-file which can be included by rtmutex.h
-
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- include/linux/rbtree.h | 27 +--------------------------
- include/linux/rbtree_type.h | 31 +++++++++++++++++++++++++++++++
- include/linux/rtmutex.h | 2 +-
- 3 files changed, 33 insertions(+), 27 deletions(-)
- create mode 100644 include/linux/rbtree_type.h
-
-diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
-index d7db17996322..c33b0e16d04b 100644
---- a/include/linux/rbtree.h
-+++ b/include/linux/rbtree.h
-@@ -19,19 +19,9 @@
-
- #include <linux/kernel.h>
- #include <linux/stddef.h>
-+#include <linux/rbtree_type.h>
- #include <linux/rcupdate.h>
-
--struct rb_node {
-- unsigned long __rb_parent_color;
-- struct rb_node *rb_right;
-- struct rb_node *rb_left;
--} __attribute__((aligned(sizeof(long))));
-- /* The alignment might seem pointless, but allegedly CRIS needs it */
--
--struct rb_root {
-- struct rb_node *rb_node;
--};
--
- #define rb_parent(r) ((struct rb_node *)((r)->__rb_parent_color & ~3))
-
- #define RB_ROOT (struct rb_root) { NULL, }
-@@ -112,21 +102,6 @@ static inline void rb_link_node_rcu(struct rb_node *node, struct rb_node *parent
- typeof(*pos), field); 1; }); \
- pos = n)
-
--/*
-- * Leftmost-cached rbtrees.
-- *
-- * We do not cache the rightmost node based on footprint
-- * size vs number of potential users that could benefit
-- * from O(1) rb_last(). Just not worth it, users that want
-- * this feature can always implement the logic explicitly.
-- * Furthermore, users that want to cache both pointers may
-- * find it a bit asymmetric, but that's ok.
-- */
--struct rb_root_cached {
-- struct rb_root rb_root;
-- struct rb_node *rb_leftmost;
--};
--
- #define RB_ROOT_CACHED (struct rb_root_cached) { {NULL, }, NULL }
-
- /* Same as rb_first(), but O(1) */
-diff --git a/include/linux/rbtree_type.h b/include/linux/rbtree_type.h
-new file mode 100644
-index 000000000000..77a89dd2c7c6
---- /dev/null
-+++ b/include/linux/rbtree_type.h
-@@ -0,0 +1,31 @@
-+/* SPDX-License-Identifier: GPL-2.0-or-later */
-+#ifndef _LINUX_RBTREE_TYPE_H
-+#define _LINUX_RBTREE_TYPE_H
-+
-+struct rb_node {
-+ unsigned long __rb_parent_color;
-+ struct rb_node *rb_right;
-+ struct rb_node *rb_left;
-+} __attribute__((aligned(sizeof(long))));
-+/* The alignment might seem pointless, but allegedly CRIS needs it */
-+
-+struct rb_root {
-+ struct rb_node *rb_node;
-+};
-+
-+/*
-+ * Leftmost-cached rbtrees.
-+ *
-+ * We do not cache the rightmost node based on footprint
-+ * size vs number of potential users that could benefit
-+ * from O(1) rb_last(). Just not worth it, users that want
-+ * this feature can always implement the logic explicitly.
-+ * Furthermore, users that want to cache both pointers may
-+ * find it a bit asymmetric, but that's ok.
-+ */
-+struct rb_root_cached {
-+ struct rb_root rb_root;
-+ struct rb_node *rb_leftmost;
-+};
-+
-+#endif
-diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
-index add1dab27df5..b828b938c876 100644
---- a/include/linux/rtmutex.h
-+++ b/include/linux/rtmutex.h
-@@ -14,7 +14,7 @@
- #define __LINUX_RT_MUTEX_H
-
- #include <linux/linkage.h>
--#include <linux/rbtree.h>
-+#include <linux/rbtree_type.h>
- #include <linux/spinlock_types_raw.h>
-
- extern int max_lock_depth; /* for sysctl */
---
-2.30.2
-
diff --git a/debian/patches-rt/0170-locking-rtmutex-Provide-rt_mutex_slowlock_locked.patch
deleted file mode 100644
index ec9749cc2..000000000
--- a/debian/patches-rt/0170-locking-rtmutex-Provide-rt_mutex_slowlock_locked.patch
+++ /dev/null
@@ -1,145 +0,0 @@
-From bd9cecae1dc3c66e172989c8a9f0936177e1e2fb Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Thu, 12 Oct 2017 16:14:22 +0200
-Subject: [PATCH 170/296] locking/rtmutex: Provide rt_mutex_slowlock_locked()
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-This is the inner-part of rt_mutex_slowlock(), required for rwsem-rt.
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- kernel/locking/rtmutex.c | 67 +++++++++++++++++++--------------
- kernel/locking/rtmutex_common.h | 7 ++++
- 2 files changed, 45 insertions(+), 29 deletions(-)
-
-diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
-index 5148a2b49c55..e7645a09d0fb 100644
---- a/kernel/locking/rtmutex.c
-+++ b/kernel/locking/rtmutex.c
-@@ -1234,35 +1234,16 @@ static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
- }
- }
-
--/*
-- * Slow path lock function:
-- */
--static int __sched
--rt_mutex_slowlock(struct rt_mutex *lock, int state,
-- struct hrtimer_sleeper *timeout,
-- enum rtmutex_chainwalk chwalk)
-+int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state,
-+ struct hrtimer_sleeper *timeout,
-+ enum rtmutex_chainwalk chwalk,
-+ struct rt_mutex_waiter *waiter)
- {
-- struct rt_mutex_waiter waiter;
-- unsigned long flags;
-- int ret = 0;
--
-- rt_mutex_init_waiter(&waiter);
--
-- /*
-- * Technically we could use raw_spin_[un]lock_irq() here, but this can
-- * be called in early boot if the cmpxchg() fast path is disabled
-- * (debug, no architecture support). In this case we will acquire the
-- * rtmutex with lock->wait_lock held. But we cannot unconditionally
-- * enable interrupts in that early boot case. So we need to use the
-- * irqsave/restore variants.
-- */
-- raw_spin_lock_irqsave(&lock->wait_lock, flags);
-+ int ret;
-
- /* Try to acquire the lock again: */
-- if (try_to_take_rt_mutex(lock, current, NULL)) {
-- raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
-+ if (try_to_take_rt_mutex(lock, current, NULL))
- return 0;
-- }
-
- set_current_state(state);
-
-@@ -1270,16 +1251,16 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
- if (unlikely(timeout))
- hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
-
-- ret = task_blocks_on_rt_mutex(lock, &waiter, current, chwalk);
-+ ret = task_blocks_on_rt_mutex(lock, waiter, current, chwalk);
-
- if (likely(!ret))
- /* sleep on the mutex */
-- ret = __rt_mutex_slowlock(lock, state, timeout, &waiter);
-+ ret = __rt_mutex_slowlock(lock, state, timeout, waiter);
-
- if (unlikely(ret)) {
- __set_current_state(TASK_RUNNING);
-- remove_waiter(lock, &waiter);
-- rt_mutex_handle_deadlock(ret, chwalk, &waiter);
-+ remove_waiter(lock, waiter);
-+ rt_mutex_handle_deadlock(ret, chwalk, waiter);
- }
-
- /*
-@@ -1287,6 +1268,34 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
- * unconditionally. We might have to fix that up.
- */
- fixup_rt_mutex_waiters(lock);
-+ return ret;
-+}
-+
-+/*
-+ * Slow path lock function:
-+ */
-+static int __sched
-+rt_mutex_slowlock(struct rt_mutex *lock, int state,
-+ struct hrtimer_sleeper *timeout,
-+ enum rtmutex_chainwalk chwalk)
-+{
-+ struct rt_mutex_waiter waiter;
-+ unsigned long flags;
-+ int ret = 0;
-+
-+ rt_mutex_init_waiter(&waiter);
-+
-+ /*
-+ * Technically we could use raw_spin_[un]lock_irq() here, but this can
-+ * be called in early boot if the cmpxchg() fast path is disabled
-+ * (debug, no architecture support). In this case we will acquire the
-+ * rtmutex with lock->wait_lock held. But we cannot unconditionally
-+ * enable interrupts in that early boot case. So we need to use the
-+ * irqsave/restore variants.
-+ */
-+ raw_spin_lock_irqsave(&lock->wait_lock, flags);
-+
-+ ret = rt_mutex_slowlock_locked(lock, state, timeout, chwalk, &waiter);
-
- raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
-
-diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
-index 37cd6b3bf6f4..b5a2affa59d5 100644
---- a/kernel/locking/rtmutex_common.h
-+++ b/kernel/locking/rtmutex_common.h
-@@ -15,6 +15,7 @@
-
- #include <linux/rtmutex.h>
- #include <linux/sched/wake_q.h>
-+#include <linux/sched/debug.h>
-
- /*
- * This is the control structure for tasks blocked on a rt_mutex,
-@@ -153,6 +154,12 @@ extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock,
- struct wake_q_head *wqh);
-
- extern void rt_mutex_postunlock(struct wake_q_head *wake_q);
-+/* RW semaphore special interface */
-+
-+int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state,
-+ struct hrtimer_sleeper *timeout,
-+ enum rtmutex_chainwalk chwalk,
-+ struct rt_mutex_waiter *waiter);
-
- #ifdef CONFIG_DEBUG_RT_MUTEXES
- # include "rtmutex-debug.h"
---
-2.30.2
-
diff --git a/debian/patches-rt/0171-locking-rtmutex-export-lockdep-less-version-of-rt_mu.patch
deleted file mode 100644
index e529b9fbe..000000000
--- a/debian/patches-rt/0171-locking-rtmutex-export-lockdep-less-version-of-rt_mu.patch
+++ /dev/null
@@ -1,130 +0,0 @@
-From 0bc759dd2f6a6dc1bed5bcb76a2b0a6aaa1fdb4b Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Thu, 12 Oct 2017 16:36:39 +0200
-Subject: [PATCH 171/296] locking/rtmutex: export lockdep-less version of
- rt_mutex's lock, trylock and unlock
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Required for lock implementation ontop of rtmutex.
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- kernel/locking/rtmutex.c | 54 +++++++++++++++++++++++----------
- kernel/locking/rtmutex_common.h | 3 ++
- 2 files changed, 41 insertions(+), 16 deletions(-)
-
-diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
-index e7645a09d0fb..e59243a6d7f4 100644
---- a/kernel/locking/rtmutex.c
-+++ b/kernel/locking/rtmutex.c
-@@ -1469,12 +1469,33 @@ rt_mutex_fastunlock(struct rt_mutex *lock,
- rt_mutex_postunlock(&wake_q);
- }
-
--static inline void __rt_mutex_lock(struct rt_mutex *lock, unsigned int subclass)
-+int __sched __rt_mutex_lock_state(struct rt_mutex *lock, int state)
- {
- might_sleep();
-+ return rt_mutex_fastlock(lock, state, rt_mutex_slowlock);
-+}
-+
-+/**
-+ * rt_mutex_lock_state - lock a rt_mutex with a given state
-+ *
-+ * @lock: The rt_mutex to be locked
-+ * @state: The state to set when blocking on the rt_mutex
-+ */
-+static inline int __sched rt_mutex_lock_state(struct rt_mutex *lock,
-+ unsigned int subclass, int state)
-+{
-+ int ret;
-
- mutex_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
-- rt_mutex_fastlock(lock, TASK_UNINTERRUPTIBLE, rt_mutex_slowlock);
-+ ret = __rt_mutex_lock_state(lock, state);
-+ if (ret)
-+ mutex_release(&lock->dep_map, _RET_IP_);
-+ return ret;
-+}
-+
-+static inline void __rt_mutex_lock(struct rt_mutex *lock, unsigned int subclass)
-+{
-+ rt_mutex_lock_state(lock, subclass, TASK_UNINTERRUPTIBLE);
- }
-
- #ifdef CONFIG_DEBUG_LOCK_ALLOC
-@@ -1515,16 +1536,7 @@ EXPORT_SYMBOL_GPL(rt_mutex_lock);
- */
- int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock)
- {
-- int ret;
--
-- might_sleep();
--
-- mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-- ret = rt_mutex_fastlock(lock, TASK_INTERRUPTIBLE, rt_mutex_slowlock);
-- if (ret)
-- mutex_release(&lock->dep_map, _RET_IP_);
--
-- return ret;
-+ return rt_mutex_lock_state(lock, 0, TASK_INTERRUPTIBLE);
- }
- EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible);
-
-@@ -1541,6 +1553,14 @@ int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
- return __rt_mutex_slowtrylock(lock);
- }
-
-+int __sched __rt_mutex_trylock(struct rt_mutex *lock)
-+{
-+ if (WARN_ON_ONCE(in_irq() || in_nmi() || in_serving_softirq()))
-+ return 0;
-+
-+ return rt_mutex_fasttrylock(lock, rt_mutex_slowtrylock);
-+}
-+
- /**
- * rt_mutex_trylock - try to lock a rt_mutex
- *
-@@ -1556,10 +1576,7 @@ int __sched rt_mutex_trylock(struct rt_mutex *lock)
- {
- int ret;
-
-- if (WARN_ON_ONCE(in_irq() || in_nmi() || in_serving_softirq()))
-- return 0;
--
-- ret = rt_mutex_fasttrylock(lock, rt_mutex_slowtrylock);
-+ ret = __rt_mutex_trylock(lock);
- if (ret)
- mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
-
-@@ -1567,6 +1584,11 @@ int __sched rt_mutex_trylock(struct rt_mutex *lock)
- }
- EXPORT_SYMBOL_GPL(rt_mutex_trylock);
-
-+void __sched __rt_mutex_unlock(struct rt_mutex *lock)
-+{
-+ rt_mutex_fastunlock(lock, rt_mutex_slowunlock);
-+}
-+
- /**
- * rt_mutex_unlock - unlock a rt_mutex
- *
-diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
-index b5a2affa59d5..9d1e974ca9c3 100644
---- a/kernel/locking/rtmutex_common.h
-+++ b/kernel/locking/rtmutex_common.h
-@@ -156,6 +156,9 @@ extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock,
- extern void rt_mutex_postunlock(struct wake_q_head *wake_q);
- /* RW semaphore special interface */
-
-+extern int __rt_mutex_lock_state(struct rt_mutex *lock, int state);
-+extern int __rt_mutex_trylock(struct rt_mutex *lock);
-+extern void __rt_mutex_unlock(struct rt_mutex *lock);
- int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state,
- struct hrtimer_sleeper *timeout,
- enum rtmutex_chainwalk chwalk,
---
-2.30.2
-
diff --git a/debian/patches-rt/0172-sched-Add-saved_state-for-tasks-blocked-on-sleeping-.patch
deleted file mode 100644
index 6196609fd..000000000
--- a/debian/patches-rt/0172-sched-Add-saved_state-for-tasks-blocked-on-sleeping-.patch
+++ /dev/null
@@ -1,116 +0,0 @@
-From b5dad6705d942c5cecc690b89cb988847928f953 Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Sat, 25 Jun 2011 09:21:04 +0200
-Subject: [PATCH 172/296] sched: Add saved_state for tasks blocked on sleeping
- locks
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Spinlocks are state preserving in !RT. RT changes the state when a
-task gets blocked on a lock. So we need to remember the state before
-the lock contention. If a regular wakeup (not a RTmutex related
-wakeup) happens, the saved_state is updated to running. When the lock
-sleep is done, the saved state is restored.
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
----
- include/linux/sched.h | 3 +++
- kernel/sched/core.c | 34 ++++++++++++++++++++++++++++++++--
- kernel/sched/sched.h | 1 +
- 3 files changed, 36 insertions(+), 2 deletions(-)
-
-diff --git a/include/linux/sched.h b/include/linux/sched.h
-index 1a146988088d..796ea634da89 100644
---- a/include/linux/sched.h
-+++ b/include/linux/sched.h
-@@ -655,6 +655,8 @@ struct task_struct {
- #endif
- /* -1 unrunnable, 0 runnable, >0 stopped: */
- volatile long state;
-+ /* saved state for "spinlock sleepers" */
-+ volatile long saved_state;
-
- /*
- * This begins the randomizable portion of task_struct. Only
-@@ -1772,6 +1774,7 @@ extern struct task_struct *find_get_task_by_vpid(pid_t nr);
-
- extern int wake_up_state(struct task_struct *tsk, unsigned int state);
- extern int wake_up_process(struct task_struct *tsk);
-+extern int wake_up_lock_sleeper(struct task_struct *tsk);
- extern void wake_up_new_task(struct task_struct *tsk);
-
- #ifdef CONFIG_SMP
-diff --git a/kernel/sched/core.c b/kernel/sched/core.c
-index 6763a6ec8715..200b37c704de 100644
---- a/kernel/sched/core.c
-+++ b/kernel/sched/core.c
-@@ -3282,7 +3282,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
- int cpu, success = 0;
-
- preempt_disable();
-- if (p == current) {
-+ if (!IS_ENABLED(CONFIG_PREEMPT_RT) && p == current) {
- /*
- * We're waking current, this means 'p->on_rq' and 'task_cpu(p)
- * == smp_processor_id()'. Together this means we can special
-@@ -3312,8 +3312,26 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
- */
- raw_spin_lock_irqsave(&p->pi_lock, flags);
- smp_mb__after_spinlock();
-- if (!(p->state & state))
-+ if (!(p->state & state)) {
-+ /*
-+ * The task might be running due to a spinlock sleeper
-+ * wakeup. Check the saved state and set it to running
-+ * if the wakeup condition is true.
-+ */
-+ if (!(wake_flags & WF_LOCK_SLEEPER)) {
-+ if (p->saved_state & state) {
-+ p->saved_state = TASK_RUNNING;
-+ success = 1;
-+ }
-+ }
- goto unlock;
-+ }
-+ /*
-+ * If this is a regular wakeup, then we can unconditionally
-+ * clear the saved state of a "lock sleeper".
-+ */
-+ if (!(wake_flags & WF_LOCK_SLEEPER))
-+ p->saved_state = TASK_RUNNING;
-
- trace_sched_waking(p);
-
-@@ -3502,6 +3520,18 @@ int wake_up_process(struct task_struct *p)
- }
- EXPORT_SYMBOL(wake_up_process);
-
-+/**
-+ * wake_up_lock_sleeper - Wake up a specific process blocked on a "sleeping lock"
-+ * @p: The process to be woken up.
-+ *
-+ * Same as wake_up_process() above, but wake_flags=WF_LOCK_SLEEPER to indicate
-+ * the nature of the wakeup.
-+ */
-+int wake_up_lock_sleeper(struct task_struct *p)
-+{
-+ return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER);
-+}
-+
- int wake_up_state(struct task_struct *p, unsigned int state)
- {
- return try_to_wake_up(p, state, 0);
-diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
-index 605f803937f0..32ac4fc0ce76 100644
---- a/kernel/sched/sched.h
-+++ b/kernel/sched/sched.h
-@@ -1744,6 +1744,7 @@ static inline int task_on_rq_migrating(struct task_struct *p)
- #define WF_FORK 0x02 /* Child wakeup after fork */
- #define WF_MIGRATED 0x04 /* Internal use, task got migrated */
- #define WF_ON_CPU 0x08 /* Wakee is on_cpu */
-+#define WF_LOCK_SLEEPER 0x10 /* Wakeup spinlock "sleeper" */
-
- /*
- * To aid in avoiding the subversion of "niceness" due to uneven distribution
---
-2.30.2
-
diff --git a/debian/patches-rt/0173-locking-rtmutex-add-sleeping-lock-implementation.patch
deleted file mode 100644
index e4af8bb7b..000000000
--- a/debian/patches-rt/0173-locking-rtmutex-add-sleeping-lock-implementation.patch
+++ /dev/null
@@ -1,1226 +0,0 @@
-From 9b0917638eb5255ee232610f105b9c85618c254c Mon Sep 17 00:00:00 2001
-From: Thomas Gleixner <tglx@linutronix.de>
-Date: Thu, 12 Oct 2017 17:11:19 +0200
-Subject: [PATCH 173/296] locking/rtmutex: add sleeping lock implementation
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
-
-Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
----
- include/linux/kernel.h | 5 +
- include/linux/preempt.h | 4 +
- include/linux/rtmutex.h | 19 +-
- include/linux/sched.h | 7 +
- include/linux/sched/wake_q.h | 13 +-
- include/linux/spinlock_rt.h | 155 +++++++++++
- include/linux/spinlock_types_rt.h | 38 +++
- kernel/fork.c | 1 +
- kernel/futex.c | 10 +-
- kernel/locking/rtmutex.c | 444 +++++++++++++++++++++++++++---
- kernel/locking/rtmutex_common.h | 14 +-
- kernel/sched/core.c | 39 ++-
- 12 files changed, 694 insertions(+), 55 deletions(-)
- create mode 100644 include/linux/spinlock_rt.h
- create mode 100644 include/linux/spinlock_types_rt.h
-
-diff --git a/include/linux/kernel.h b/include/linux/kernel.h
-index 665837f9a831..2cff7554395d 100644
---- a/include/linux/kernel.h
-+++ b/include/linux/kernel.h
-@@ -220,6 +220,10 @@ extern void __cant_migrate(const char *file, int line);
- */
- # define might_sleep() \
- do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
-+
-+# define might_sleep_no_state_check() \
-+ do { ___might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
-+
- /**
- * cant_sleep - annotation for functions that cannot sleep
- *
-@@ -263,6 +267,7 @@ extern void __cant_migrate(const char *file, int line);
- static inline void __might_sleep(const char *file, int line,
- int preempt_offset) { }
- # define might_sleep() do { might_resched(); } while (0)
-+# define might_sleep_no_state_check() do { might_resched(); } while (0)
- # define cant_sleep() do { } while (0)
- # define cant_migrate() do { } while (0)
- # define sched_annotate_sleep() do { } while (0)
-diff --git a/include/linux/preempt.h b/include/linux/preempt.h
-index 9881eac0698f..4d244e295e85 100644
---- a/include/linux/preempt.h
-+++ b/include/linux/preempt.h
-@@ -121,7 +121,11 @@
- /*
- * The preempt_count offset after spin_lock()
- */
-+#if !defined(CONFIG_PREEMPT_RT)
- #define PREEMPT_LOCK_OFFSET PREEMPT_DISABLE_OFFSET
-+#else
-+#define PREEMPT_LOCK_OFFSET 0
-+#endif
-
- /*
- * The preempt_count offset needed for things like:
-diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
-index b828b938c876..b02009f53026 100644
---- a/include/linux/rtmutex.h
-+++ b/include/linux/rtmutex.h
-@@ -19,6 +19,10 @@
-
- extern int max_lock_depth; /* for sysctl */
-
-+#ifdef CONFIG_DEBUG_MUTEXES
-+#include <linux/debug_locks.h>
-+#endif
-+
- /**
- * The rt_mutex structure
- *
-@@ -31,6 +35,7 @@ struct rt_mutex {
- raw_spinlock_t wait_lock;
- struct rb_root_cached waiters;
- struct task_struct *owner;
-+ int save_state;
- #ifdef CONFIG_DEBUG_LOCK_ALLOC
- struct lockdep_map dep_map;
- #endif
-@@ -67,11 +72,19 @@ do { \
- #define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)
- #endif
-
--#define __RT_MUTEX_INITIALIZER(mutexname) \
-- { .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(mutexname.wait_lock) \
-+#define __RT_MUTEX_INITIALIZER_PLAIN(mutexname) \
-+ .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(mutexname.wait_lock) \
- , .waiters = RB_ROOT_CACHED \
- , .owner = NULL \
-- __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)}
-+ __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)
-+
-+#define __RT_MUTEX_INITIALIZER(mutexname) \
-+ { __RT_MUTEX_INITIALIZER_PLAIN(mutexname) \
-+ , .save_state = 0 }
-+
-+#define __RT_MUTEX_INITIALIZER_SAVE_STATE(mutexname) \
-+ { __RT_MUTEX_INITIALIZER_PLAIN(mutexname) \
-+ , .save_state = 1 }
-
- #define DEFINE_RT_MUTEX(mutexname) \
- struct rt_mutex mutexname = __RT_MUTEX_INITIALIZER(mutexname)
-diff --git a/include/linux/sched.h b/include/linux/sched.h
-index 796ea634da89..16f9f402d111 100644
---- a/include/linux/sched.h
-+++ b/include/linux/sched.h
-@@ -141,6 +141,9 @@ struct io_uring_task;
- smp_store_mb(current->state, (state_value)); \
- } while (0)
-
-+#define __set_current_state_no_track(state_value) \
-+ current->state = (state_value);
-+
- #define set_special_state(state_value) \
- do { \
- unsigned long flags; /* may shadow */ \
-@@ -194,6 +197,9 @@ struct io_uring_task;
- #define set_current_state(state_value) \
- smp_store_mb(current->state, (state_value))
-
-+#define __set_current_state_no_track(state_value) \
-+ __set_current_state(state_value)
-+
- /*
- * set_special_state() should be used for those states when the blocking task
- * can not use the regular condition based wait-loop. In that case we must
-@@ -1014,6 +1020,7 @@ struct task_struct {
- raw_spinlock_t pi_lock;
-
- struct wake_q_node wake_q;
-+ struct wake_q_node wake_q_sleeper;
-
- #ifdef CONFIG_RT_MUTEXES
- /* PI waiters blocked on a rt_mutex held by this task: */
-diff --git a/include/linux/sched/wake_q.h b/include/linux/sched/wake_q.h
-index 26a2013ac39c..6e2dff721547 100644
---- a/include/linux/sched/wake_q.h
-+++ b/include/linux/sched/wake_q.h
-@@ -58,6 +58,17 @@ static inline bool wake_q_empty(struct wake_q_head *head)
-
- extern void wake_q_add(struct wake_q_head *head, struct task_struct *task);
- extern void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task);
--extern void wake_up_q(struct wake_q_head *head);
-+extern void wake_q_add_sleeper(struct wake_q_head *head, struct task_struct *task);
-+extern void __wake_up_q(struct wake_q_head *head, bool sleeper);
-+
-+static inline void wake_up_q(struct wake_q_head *head)
-+{
-+ __wake_up_q(head, false);
-+}
-+
-+static inline void wake_up_q_sleeper(struct wake_q_head *head)
-+{
-+ __wake_up_q(head, true);
-+}
-
- #endif /* _LINUX_SCHED_WAKE_Q_H */
-diff --git a/include/linux/spinlock_rt.h b/include/linux/spinlock_rt.h
-new file mode 100644
-index 000000000000..3085132eae38
---- /dev/null
-+++ b/include/linux/spinlock_rt.h
-@@ -0,0 +1,155 @@
-+// SPDX-License-Identifier: GPL-2.0-only
-+#ifndef __LINUX_SPINLOCK_RT_H
-+#define __LINUX_SPINLOCK_RT_H
-+
-+#ifndef __LINUX_SPINLOCK_H
-+#error Do not include directly.
Use spinlock.h -+#endif -+ -+#include <linux/bug.h> -+ -+extern void -+__rt_spin_lock_init(spinlock_t *lock, const char *name, struct lock_class_key *key); -+ -+#define spin_lock_init(slock) \ -+do { \ -+ static struct lock_class_key __key; \ -+ \ -+ rt_mutex_init(&(slock)->lock); \ -+ __rt_spin_lock_init(slock, #slock, &__key); \ -+} while (0) -+ -+extern void __lockfunc rt_spin_lock(spinlock_t *lock); -+extern void __lockfunc rt_spin_lock_nested(spinlock_t *lock, int subclass); -+extern void __lockfunc rt_spin_lock_nest_lock(spinlock_t *lock, struct lockdep_map *nest_lock); -+extern void __lockfunc rt_spin_unlock(spinlock_t *lock); -+extern void __lockfunc rt_spin_lock_unlock(spinlock_t *lock); -+extern int __lockfunc rt_spin_trylock_irqsave(spinlock_t *lock, unsigned long *flags); -+extern int __lockfunc rt_spin_trylock_bh(spinlock_t *lock); -+extern int __lockfunc rt_spin_trylock(spinlock_t *lock); -+extern int atomic_dec_and_spin_lock(atomic_t *atomic, spinlock_t *lock); -+ -+/* -+ * lockdep-less calls, for derived types like rwlock: -+ * (for trylock they can use rt_mutex_trylock() directly. -+ * Migrate disable handling must be done at the call site. 
-+ */ -+extern void __lockfunc __rt_spin_lock(struct rt_mutex *lock); -+extern void __lockfunc __rt_spin_trylock(struct rt_mutex *lock); -+extern void __lockfunc __rt_spin_unlock(struct rt_mutex *lock); -+ -+#define spin_lock(lock) rt_spin_lock(lock) -+ -+#define spin_lock_bh(lock) \ -+ do { \ -+ local_bh_disable(); \ -+ rt_spin_lock(lock); \ -+ } while (0) -+ -+#define spin_lock_irq(lock) spin_lock(lock) -+ -+#define spin_do_trylock(lock) __cond_lock(lock, rt_spin_trylock(lock)) -+ -+#define spin_trylock(lock) \ -+({ \ -+ int __locked; \ -+ __locked = spin_do_trylock(lock); \ -+ __locked; \ -+}) -+ -+#ifdef CONFIG_LOCKDEP -+# define spin_lock_nested(lock, subclass) \ -+ do { \ -+ rt_spin_lock_nested(lock, subclass); \ -+ } while (0) -+ -+#define spin_lock_bh_nested(lock, subclass) \ -+ do { \ -+ local_bh_disable(); \ -+ rt_spin_lock_nested(lock, subclass); \ -+ } while (0) -+ -+# define spin_lock_nest_lock(lock, subclass) \ -+ do { \ -+ typecheck(struct lockdep_map *, &(subclass)->dep_map); \ -+ rt_spin_lock_nest_lock(lock, &(subclass)->dep_map); \ -+ } while (0) -+ -+# define spin_lock_irqsave_nested(lock, flags, subclass) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ flags = 0; \ -+ rt_spin_lock_nested(lock, subclass); \ -+ } while (0) -+#else -+# define spin_lock_nested(lock, subclass) spin_lock(((void)(subclass), (lock))) -+# define spin_lock_nest_lock(lock, subclass) spin_lock(((void)(subclass), (lock))) -+# define spin_lock_bh_nested(lock, subclass) spin_lock_bh(((void)(subclass), (lock))) -+ -+# define spin_lock_irqsave_nested(lock, flags, subclass) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ flags = 0; \ -+ spin_lock(((void)(subclass), (lock))); \ -+ } while (0) -+#endif -+ -+#define spin_lock_irqsave(lock, flags) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ flags = 0; \ -+ spin_lock(lock); \ -+ } while (0) -+ -+#define spin_unlock(lock) rt_spin_unlock(lock) -+ -+#define spin_unlock_bh(lock) \ -+ do { \ -+ rt_spin_unlock(lock); \ -+ 
local_bh_enable(); \ -+ } while (0) -+ -+#define spin_unlock_irq(lock) spin_unlock(lock) -+ -+#define spin_unlock_irqrestore(lock, flags) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ (void) flags; \ -+ spin_unlock(lock); \ -+ } while (0) -+ -+#define spin_trylock_bh(lock) __cond_lock(lock, rt_spin_trylock_bh(lock)) -+#define spin_trylock_irq(lock) spin_trylock(lock) -+ -+#define spin_trylock_irqsave(lock, flags) \ -+({ \ -+ int __locked; \ -+ \ -+ typecheck(unsigned long, flags); \ -+ flags = 0; \ -+ __locked = spin_trylock(lock); \ -+ __locked; \ -+}) -+ -+#ifdef CONFIG_GENERIC_LOCKBREAK -+# define spin_is_contended(lock) ((lock)->break_lock) -+#else -+# define spin_is_contended(lock) (((void)(lock), 0)) -+#endif -+ -+static inline int spin_can_lock(spinlock_t *lock) -+{ -+ return !rt_mutex_is_locked(&lock->lock); -+} -+ -+static inline int spin_is_locked(spinlock_t *lock) -+{ -+ return rt_mutex_is_locked(&lock->lock); -+} -+ -+static inline void assert_spin_locked(spinlock_t *lock) -+{ -+ BUG_ON(!spin_is_locked(lock)); -+} -+ -+#endif -diff --git a/include/linux/spinlock_types_rt.h b/include/linux/spinlock_types_rt.h -new file mode 100644 -index 000000000000..446da786e5d5 ---- /dev/null -+++ b/include/linux/spinlock_types_rt.h -@@ -0,0 +1,38 @@ -+// SPDX-License-Identifier: GPL-2.0-only -+#ifndef __LINUX_SPINLOCK_TYPES_RT_H -+#define __LINUX_SPINLOCK_TYPES_RT_H -+ -+#ifndef __LINUX_SPINLOCK_TYPES_H -+#error "Do not include directly. 
Include spinlock_types.h instead" -+#endif -+ -+#include <linux/cache.h> -+ -+/* -+ * PREEMPT_RT: spinlocks - an RT mutex plus lock-break field: -+ */ -+typedef struct spinlock { -+ struct rt_mutex lock; -+ unsigned int break_lock; -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ struct lockdep_map dep_map; -+#endif -+} spinlock_t; -+ -+#define __RT_SPIN_INITIALIZER(name) \ -+ { \ -+ .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(name.wait_lock), \ -+ .save_state = 1, \ -+ } -+/* -+.wait_list = PLIST_HEAD_INIT_RAW((name).lock.wait_list, (name).lock.wait_lock) -+*/ -+ -+#define __SPIN_LOCK_UNLOCKED(name) \ -+ { .lock = __RT_SPIN_INITIALIZER(name.lock), \ -+ SPIN_DEP_MAP_INIT(name) } -+ -+#define DEFINE_SPINLOCK(name) \ -+ spinlock_t name = __SPIN_LOCK_UNLOCKED(name) -+ -+#endif -diff --git a/kernel/fork.c b/kernel/fork.c -index 356104c1e0a3..e903347d1bc1 100644 ---- a/kernel/fork.c -+++ b/kernel/fork.c -@@ -926,6 +926,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) - tsk->splice_pipe = NULL; - tsk->task_frag.page = NULL; - tsk->wake_q.next = NULL; -+ tsk->wake_q_sleeper.next = NULL; - - account_kernel_stack(tsk, 1); - -diff --git a/kernel/futex.c b/kernel/futex.c -index 6d7aaa8fe439..90785b5a5b78 100644 ---- a/kernel/futex.c -+++ b/kernel/futex.c -@@ -1499,6 +1499,7 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_ - struct task_struct *new_owner; - bool postunlock = false; - DEFINE_WAKE_Q(wake_q); -+ DEFINE_WAKE_Q(wake_sleeper_q); - int ret = 0; - - new_owner = rt_mutex_next_owner(&pi_state->pi_mutex); -@@ -1548,14 +1549,15 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_ - * not fail. 
- */ - pi_state_update_owner(pi_state, new_owner); -- postunlock = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q); -+ postunlock = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q, -+ &wake_sleeper_q); - } - - out_unlock: - raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock); - - if (postunlock) -- rt_mutex_postunlock(&wake_q); -+ rt_mutex_postunlock(&wake_q, &wake_sleeper_q); - - return ret; - } -@@ -2858,7 +2860,7 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags, - goto no_block; - } - -- rt_mutex_init_waiter(&rt_waiter); -+ rt_mutex_init_waiter(&rt_waiter, false); - - /* - * On PREEMPT_RT_FULL, when hb->lock becomes an rt_mutex, we must not -@@ -3204,7 +3206,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, - * The waiter is allocated on our stack, manipulated by the requeue - * code while we sleep on uaddr. - */ -- rt_mutex_init_waiter(&rt_waiter); -+ rt_mutex_init_waiter(&rt_waiter, false); - - ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2, FUTEX_WRITE); - if (unlikely(ret != 0)) -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index e59243a6d7f4..7a38472b35ec 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -8,6 +8,11 @@ - * Copyright (C) 2005-2006 Timesys Corp., Thomas Gleixner <tglx@timesys.com> - * Copyright (C) 2005 Kihon Technologies Inc., Steven Rostedt - * Copyright (C) 2006 Esben Nielsen -+ * Adaptive Spinlocks: -+ * Copyright (C) 2008 Novell, Inc., Gregory Haskins, Sven Dietrich, -+ * and Peter Morreale, -+ * Adaptive Spinlocks simplification: -+ * Copyright (C) 2008 Red Hat, Inc., Steven Rostedt <srostedt@redhat.com> - * - * See Documentation/locking/rt-mutex-design.rst for details. 
- */ -@@ -233,7 +238,7 @@ static inline bool unlock_rt_mutex_safe(struct rt_mutex *lock, - * Only use with rt_mutex_waiter_{less,equal}() - */ - #define task_to_waiter(p) \ -- &(struct rt_mutex_waiter){ .prio = (p)->prio, .deadline = (p)->dl.deadline } -+ &(struct rt_mutex_waiter){ .prio = (p)->prio, .deadline = (p)->dl.deadline, .task = (p) } - - static inline int - rt_mutex_waiter_less(struct rt_mutex_waiter *left, -@@ -273,6 +278,27 @@ rt_mutex_waiter_equal(struct rt_mutex_waiter *left, - return 1; - } - -+#define STEAL_NORMAL 0 -+#define STEAL_LATERAL 1 -+ -+static inline int -+rt_mutex_steal(struct rt_mutex *lock, struct rt_mutex_waiter *waiter, int mode) -+{ -+ struct rt_mutex_waiter *top_waiter = rt_mutex_top_waiter(lock); -+ -+ if (waiter == top_waiter || rt_mutex_waiter_less(waiter, top_waiter)) -+ return 1; -+ -+ /* -+ * Note that RT tasks are excluded from lateral-steals -+ * to prevent the introduction of an unbounded latency. -+ */ -+ if (mode == STEAL_NORMAL || rt_task(waiter->task)) -+ return 0; -+ -+ return rt_mutex_waiter_equal(waiter, top_waiter); -+} -+ - static void - rt_mutex_enqueue(struct rt_mutex *lock, struct rt_mutex_waiter *waiter) - { -@@ -377,6 +403,14 @@ static bool rt_mutex_cond_detect_deadlock(struct rt_mutex_waiter *waiter, - return debug_rt_mutex_detect_deadlock(waiter, chwalk); - } - -+static void rt_mutex_wake_waiter(struct rt_mutex_waiter *waiter) -+{ -+ if (waiter->savestate) -+ wake_up_lock_sleeper(waiter->task); -+ else -+ wake_up_process(waiter->task); -+} -+ - /* - * Max number of times we'll walk the boosting chain: - */ -@@ -700,13 +734,16 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task, - * follow here. This is the end of the chain we are walking. - */ - if (!rt_mutex_owner(lock)) { -+ struct rt_mutex_waiter *lock_top_waiter; -+ - /* - * If the requeue [7] above changed the top waiter, - * then we need to wake the new top waiter up to try - * to get the lock. 
- */ -- if (prerequeue_top_waiter != rt_mutex_top_waiter(lock)) -- wake_up_process(rt_mutex_top_waiter(lock)->task); -+ lock_top_waiter = rt_mutex_top_waiter(lock); -+ if (prerequeue_top_waiter != lock_top_waiter) -+ rt_mutex_wake_waiter(lock_top_waiter); - raw_spin_unlock_irq(&lock->wait_lock); - return 0; - } -@@ -807,9 +844,11 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task, - * @task: The task which wants to acquire the lock - * @waiter: The waiter that is queued to the lock's wait tree if the - * callsite called task_blocked_on_lock(), otherwise NULL -+ * @mode: Lock steal mode (STEAL_NORMAL, STEAL_LATERAL) - */ --static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task, -- struct rt_mutex_waiter *waiter) -+static int __try_to_take_rt_mutex(struct rt_mutex *lock, -+ struct task_struct *task, -+ struct rt_mutex_waiter *waiter, int mode) - { - lockdep_assert_held(&lock->wait_lock); - -@@ -845,12 +884,11 @@ static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task, - */ - if (waiter) { - /* -- * If waiter is not the highest priority waiter of -- * @lock, give up. -+ * If waiter is not the highest priority waiter of @lock, -+ * or its peer when lateral steal is allowed, give up. - */ -- if (waiter != rt_mutex_top_waiter(lock)) -+ if (!rt_mutex_steal(lock, waiter, mode)) - return 0; -- - /* - * We can acquire the lock. Remove the waiter from the - * lock waiters tree. -@@ -868,14 +906,12 @@ static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task, - */ - if (rt_mutex_has_waiters(lock)) { - /* -- * If @task->prio is greater than or equal to -- * the top waiter priority (kernel view), -- * @task lost. -+ * If @task->prio is greater than the top waiter -+ * priority (kernel view), or equal to it when a -+ * lateral steal is forbidden, @task lost. 
- */ -- if (!rt_mutex_waiter_less(task_to_waiter(task), -- rt_mutex_top_waiter(lock))) -+ if (!rt_mutex_steal(lock, task_to_waiter(task), mode)) - return 0; -- - /* - * The current top waiter stays enqueued. We - * don't have to change anything in the lock -@@ -922,6 +958,289 @@ static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task, - return 1; - } - -+#ifdef CONFIG_PREEMPT_RT -+/* -+ * preemptible spin_lock functions: -+ */ -+static inline void rt_spin_lock_fastlock(struct rt_mutex *lock, -+ void (*slowfn)(struct rt_mutex *lock)) -+{ -+ might_sleep_no_state_check(); -+ -+ if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) -+ return; -+ else -+ slowfn(lock); -+} -+ -+static inline void rt_spin_lock_fastunlock(struct rt_mutex *lock, -+ void (*slowfn)(struct rt_mutex *lock)) -+{ -+ if (likely(rt_mutex_cmpxchg_release(lock, current, NULL))) -+ return; -+ else -+ slowfn(lock); -+} -+#ifdef CONFIG_SMP -+/* -+ * Note that owner is a speculative pointer and dereferencing relies -+ * on rcu_read_lock() and the check against the lock owner. -+ */ -+static int adaptive_wait(struct rt_mutex *lock, -+ struct task_struct *owner) -+{ -+ int res = 0; -+ -+ rcu_read_lock(); -+ for (;;) { -+ if (owner != rt_mutex_owner(lock)) -+ break; -+ /* -+ * Ensure that owner->on_cpu is dereferenced _after_ -+ * checking the above to be valid. -+ */ -+ barrier(); -+ if (!owner->on_cpu) { -+ res = 1; -+ break; -+ } -+ cpu_relax(); -+ } -+ rcu_read_unlock(); -+ return res; -+} -+#else -+static int adaptive_wait(struct rt_mutex *lock, -+ struct task_struct *orig_owner) -+{ -+ return 1; -+} -+#endif -+ -+static int task_blocks_on_rt_mutex(struct rt_mutex *lock, -+ struct rt_mutex_waiter *waiter, -+ struct task_struct *task, -+ enum rtmutex_chainwalk chwalk); -+/* -+ * Slow path lock function spin_lock style: this variant is very -+ * careful not to miss any non-lock wakeups. 
-+ * -+ * We store the current state under p->pi_lock in p->saved_state and -+ * the try_to_wake_up() code handles this accordingly. -+ */ -+void __sched rt_spin_lock_slowlock_locked(struct rt_mutex *lock, -+ struct rt_mutex_waiter *waiter, -+ unsigned long flags) -+{ -+ struct task_struct *lock_owner, *self = current; -+ struct rt_mutex_waiter *top_waiter; -+ int ret; -+ -+ if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) -+ return; -+ -+ BUG_ON(rt_mutex_owner(lock) == self); -+ -+ /* -+ * We save whatever state the task is in and we'll restore it -+ * after acquiring the lock taking real wakeups into account -+ * as well. We are serialized via pi_lock against wakeups. See -+ * try_to_wake_up(). -+ */ -+ raw_spin_lock(&self->pi_lock); -+ self->saved_state = self->state; -+ __set_current_state_no_track(TASK_UNINTERRUPTIBLE); -+ raw_spin_unlock(&self->pi_lock); -+ -+ ret = task_blocks_on_rt_mutex(lock, waiter, self, RT_MUTEX_MIN_CHAINWALK); -+ BUG_ON(ret); -+ -+ for (;;) { -+ /* Try to acquire the lock again. */ -+ if (__try_to_take_rt_mutex(lock, self, waiter, STEAL_LATERAL)) -+ break; -+ -+ top_waiter = rt_mutex_top_waiter(lock); -+ lock_owner = rt_mutex_owner(lock); -+ -+ raw_spin_unlock_irqrestore(&lock->wait_lock, flags); -+ -+ if (top_waiter != waiter || adaptive_wait(lock, lock_owner)) -+ schedule(); -+ -+ raw_spin_lock_irqsave(&lock->wait_lock, flags); -+ -+ raw_spin_lock(&self->pi_lock); -+ __set_current_state_no_track(TASK_UNINTERRUPTIBLE); -+ raw_spin_unlock(&self->pi_lock); -+ } -+ -+ /* -+ * Restore the task state to current->saved_state. We set it -+ * to the original state above and the try_to_wake_up() code -+ * has possibly updated it when a real (non-rtmutex) wakeup -+ * happened while we were blocked. Clear saved_state so -+ * try_to_wakeup() does not get confused. 
-+ */ -+ raw_spin_lock(&self->pi_lock); -+ __set_current_state_no_track(self->saved_state); -+ self->saved_state = TASK_RUNNING; -+ raw_spin_unlock(&self->pi_lock); -+ -+ /* -+ * try_to_take_rt_mutex() sets the waiter bit -+ * unconditionally. We might have to fix that up: -+ */ -+ fixup_rt_mutex_waiters(lock); -+ -+ BUG_ON(rt_mutex_has_waiters(lock) && waiter == rt_mutex_top_waiter(lock)); -+ BUG_ON(!RB_EMPTY_NODE(&waiter->tree_entry)); -+} -+ -+static void noinline __sched rt_spin_lock_slowlock(struct rt_mutex *lock) -+{ -+ struct rt_mutex_waiter waiter; -+ unsigned long flags; -+ -+ rt_mutex_init_waiter(&waiter, true); -+ -+ raw_spin_lock_irqsave(&lock->wait_lock, flags); -+ rt_spin_lock_slowlock_locked(lock, &waiter, flags); -+ raw_spin_unlock_irqrestore(&lock->wait_lock, flags); -+ debug_rt_mutex_free_waiter(&waiter); -+} -+ -+static bool __sched __rt_mutex_unlock_common(struct rt_mutex *lock, -+ struct wake_q_head *wake_q, -+ struct wake_q_head *wq_sleeper); -+/* -+ * Slow path to release a rt_mutex spin_lock style -+ */ -+void __sched rt_spin_lock_slowunlock(struct rt_mutex *lock) -+{ -+ unsigned long flags; -+ DEFINE_WAKE_Q(wake_q); -+ DEFINE_WAKE_Q(wake_sleeper_q); -+ bool postunlock; -+ -+ raw_spin_lock_irqsave(&lock->wait_lock, flags); -+ postunlock = __rt_mutex_unlock_common(lock, &wake_q, &wake_sleeper_q); -+ raw_spin_unlock_irqrestore(&lock->wait_lock, flags); -+ -+ if (postunlock) -+ rt_mutex_postunlock(&wake_q, &wake_sleeper_q); -+} -+ -+void __lockfunc rt_spin_lock(spinlock_t *lock) -+{ -+ spin_acquire(&lock->dep_map, 0, 0, _RET_IP_); -+ rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock); -+ migrate_disable(); -+} -+EXPORT_SYMBOL(rt_spin_lock); -+ -+void __lockfunc __rt_spin_lock(struct rt_mutex *lock) -+{ -+ rt_spin_lock_fastlock(lock, rt_spin_lock_slowlock); -+} -+ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+void __lockfunc rt_spin_lock_nested(spinlock_t *lock, int subclass) -+{ -+ spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_); -+ 
rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock); -+ migrate_disable(); -+} -+EXPORT_SYMBOL(rt_spin_lock_nested); -+ -+void __lockfunc rt_spin_lock_nest_lock(spinlock_t *lock, -+ struct lockdep_map *nest_lock) -+{ -+ spin_acquire_nest(&lock->dep_map, 0, 0, nest_lock, _RET_IP_); -+ rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock); -+ migrate_disable(); -+} -+EXPORT_SYMBOL(rt_spin_lock_nest_lock); -+#endif -+ -+void __lockfunc rt_spin_unlock(spinlock_t *lock) -+{ -+ /* NOTE: we always pass in '1' for nested, for simplicity */ -+ spin_release(&lock->dep_map, _RET_IP_); -+ migrate_enable(); -+ rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock); -+} -+EXPORT_SYMBOL(rt_spin_unlock); -+ -+void __lockfunc __rt_spin_unlock(struct rt_mutex *lock) -+{ -+ rt_spin_lock_fastunlock(lock, rt_spin_lock_slowunlock); -+} -+EXPORT_SYMBOL(__rt_spin_unlock); -+ -+/* -+ * Wait for the lock to get unlocked: instead of polling for an unlock -+ * (like raw spinlocks do), we lock and unlock, to force the kernel to -+ * schedule if there's contention: -+ */ -+void __lockfunc rt_spin_lock_unlock(spinlock_t *lock) -+{ -+ spin_lock(lock); -+ spin_unlock(lock); -+} -+EXPORT_SYMBOL(rt_spin_lock_unlock); -+ -+int __lockfunc rt_spin_trylock(spinlock_t *lock) -+{ -+ int ret; -+ -+ ret = __rt_mutex_trylock(&lock->lock); -+ if (ret) { -+ spin_acquire(&lock->dep_map, 0, 1, _RET_IP_); -+ migrate_disable(); -+ } -+ return ret; -+} -+EXPORT_SYMBOL(rt_spin_trylock); -+ -+int __lockfunc rt_spin_trylock_bh(spinlock_t *lock) -+{ -+ int ret; -+ -+ local_bh_disable(); -+ ret = __rt_mutex_trylock(&lock->lock); -+ if (ret) { -+ spin_acquire(&lock->dep_map, 0, 1, _RET_IP_); -+ migrate_disable(); -+ } else { -+ local_bh_enable(); -+ } -+ return ret; -+} -+EXPORT_SYMBOL(rt_spin_trylock_bh); -+ -+void -+__rt_spin_lock_init(spinlock_t *lock, const char *name, struct lock_class_key *key) -+{ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ /* -+ * Make sure we are not reinitializing a held lock: -+ */ -+ 
debug_check_no_locks_freed((void *)lock, sizeof(*lock)); -+	lockdep_init_map(&lock->dep_map, name, key, 0); -+#endif -+} -+EXPORT_SYMBOL(__rt_spin_lock_init); -+ -+#endif /* PREEMPT_RT */ -+ -+static inline int -+try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task, -+		     struct rt_mutex_waiter *waiter) -+{ -+	return __try_to_take_rt_mutex(lock, task, waiter, STEAL_NORMAL); -+} -+ - /* - * Task blocks on lock. - * -@@ -1035,6 +1354,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock, - * Called with lock->wait_lock held and interrupts disabled. - */ - static void mark_wakeup_next_waiter(struct wake_q_head *wake_q, -+ struct wake_q_head *wake_sleeper_q, - struct rt_mutex *lock) - { - struct rt_mutex_waiter *waiter; -@@ -1074,7 +1394,10 @@ static void mark_wakeup_next_waiter(struct wake_q_head *wake_q, - * Pairs with preempt_enable() in rt_mutex_postunlock(); - */ - preempt_disable(); -- wake_q_add(wake_q, waiter->task); -+ if (waiter->savestate) -+ wake_q_add_sleeper(wake_sleeper_q, waiter->task); -+ else -+ wake_q_add(wake_q, waiter->task); - raw_spin_unlock(&current->pi_lock); - } - -@@ -1158,21 +1481,22 @@ void rt_mutex_adjust_pi(struct task_struct *task) - return; - } - next_lock = waiter->lock; -- raw_spin_unlock_irqrestore(&task->pi_lock, flags); - - /* gets dropped in rt_mutex_adjust_prio_chain()! 
*/ - get_task_struct(task); - -+ raw_spin_unlock_irqrestore(&task->pi_lock, flags); - rt_mutex_adjust_prio_chain(task, RT_MUTEX_MIN_CHAINWALK, NULL, - next_lock, NULL, task); - } - --void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter) -+void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter, bool savestate) - { - debug_rt_mutex_init_waiter(waiter); - RB_CLEAR_NODE(&waiter->pi_tree_entry); - RB_CLEAR_NODE(&waiter->tree_entry); - waiter->task = NULL; -+ waiter->savestate = savestate; - } - - /** -@@ -1283,7 +1607,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state, - unsigned long flags; - int ret = 0; - -- rt_mutex_init_waiter(&waiter); -+ rt_mutex_init_waiter(&waiter, false); - - /* - * Technically we could use raw_spin_[un]lock_irq() here, but this can -@@ -1356,7 +1680,8 @@ static inline int rt_mutex_slowtrylock(struct rt_mutex *lock) - * Return whether the current task needs to call rt_mutex_postunlock(). - */ - static bool __sched rt_mutex_slowunlock(struct rt_mutex *lock, -- struct wake_q_head *wake_q) -+ struct wake_q_head *wake_q, -+ struct wake_q_head *wake_sleeper_q) - { - unsigned long flags; - -@@ -1410,7 +1735,7 @@ static bool __sched rt_mutex_slowunlock(struct rt_mutex *lock, - * - * Queue the next waiter for wakeup once we release the wait_lock. - */ -- mark_wakeup_next_waiter(wake_q, lock); -+ mark_wakeup_next_waiter(wake_q, wake_sleeper_q, lock); - raw_spin_unlock_irqrestore(&lock->wait_lock, flags); - - return true; /* call rt_mutex_postunlock() */ -@@ -1447,9 +1772,11 @@ rt_mutex_fasttrylock(struct rt_mutex *lock, - /* - * Performs the wakeup of the the top-waiter and re-enables preemption. 
- */ --void rt_mutex_postunlock(struct wake_q_head *wake_q) -+void rt_mutex_postunlock(struct wake_q_head *wake_q, -+ struct wake_q_head *wake_sleeper_q) - { - wake_up_q(wake_q); -+ wake_up_q_sleeper(wake_sleeper_q); - - /* Pairs with preempt_disable() in rt_mutex_slowunlock() */ - preempt_enable(); -@@ -1458,15 +1785,17 @@ void rt_mutex_postunlock(struct wake_q_head *wake_q) - static inline void - rt_mutex_fastunlock(struct rt_mutex *lock, - bool (*slowfn)(struct rt_mutex *lock, -- struct wake_q_head *wqh)) -+ struct wake_q_head *wqh, -+ struct wake_q_head *wq_sleeper)) - { - DEFINE_WAKE_Q(wake_q); -+ DEFINE_WAKE_Q(wake_sleeper_q); - - if (likely(rt_mutex_cmpxchg_release(lock, current, NULL))) - return; - -- if (slowfn(lock, &wake_q)) -- rt_mutex_postunlock(&wake_q); -+ if (slowfn(lock, &wake_q, &wake_sleeper_q)) -+ rt_mutex_postunlock(&wake_q, &wake_sleeper_q); - } - - int __sched __rt_mutex_lock_state(struct rt_mutex *lock, int state) -@@ -1597,16 +1926,13 @@ void __sched __rt_mutex_unlock(struct rt_mutex *lock) - void __sched rt_mutex_unlock(struct rt_mutex *lock) - { - mutex_release(&lock->dep_map, _RET_IP_); -- rt_mutex_fastunlock(lock, rt_mutex_slowunlock); -+ __rt_mutex_unlock(lock); - } - EXPORT_SYMBOL_GPL(rt_mutex_unlock); - --/** -- * Futex variant, that since futex variants do not use the fast-path, can be -- * simple and will not need to retry. -- */ --bool __sched __rt_mutex_futex_unlock(struct rt_mutex *lock, -- struct wake_q_head *wake_q) -+static bool __sched __rt_mutex_unlock_common(struct rt_mutex *lock, -+ struct wake_q_head *wake_q, -+ struct wake_q_head *wq_sleeper) - { - lockdep_assert_held(&lock->wait_lock); - -@@ -1623,23 +1949,35 @@ bool __sched __rt_mutex_futex_unlock(struct rt_mutex *lock, - * avoid inversion prior to the wakeup. preempt_disable() - * therein pairs with rt_mutex_postunlock(). 
- */ -- mark_wakeup_next_waiter(wake_q, lock); -+ mark_wakeup_next_waiter(wake_q, wq_sleeper, lock); - - return true; /* call postunlock() */ - } - -+/** -+ * Futex variant, that since futex variants do not use the fast-path, can be -+ * simple and will not need to retry. -+ */ -+bool __sched __rt_mutex_futex_unlock(struct rt_mutex *lock, -+ struct wake_q_head *wake_q, -+ struct wake_q_head *wq_sleeper) -+{ -+ return __rt_mutex_unlock_common(lock, wake_q, wq_sleeper); -+} -+ - void __sched rt_mutex_futex_unlock(struct rt_mutex *lock) - { - DEFINE_WAKE_Q(wake_q); -+ DEFINE_WAKE_Q(wake_sleeper_q); - unsigned long flags; - bool postunlock; - - raw_spin_lock_irqsave(&lock->wait_lock, flags); -- postunlock = __rt_mutex_futex_unlock(lock, &wake_q); -+ postunlock = __rt_mutex_futex_unlock(lock, &wake_q, &wake_sleeper_q); - raw_spin_unlock_irqrestore(&lock->wait_lock, flags); - - if (postunlock) -- rt_mutex_postunlock(&wake_q); -+ rt_mutex_postunlock(&wake_q, &wake_sleeper_q); - } - - /** -@@ -1675,7 +2013,7 @@ void __rt_mutex_init(struct rt_mutex *lock, const char *name, - if (name && key) - debug_rt_mutex_init(lock, name, key); - } --EXPORT_SYMBOL_GPL(__rt_mutex_init); -+EXPORT_SYMBOL(__rt_mutex_init); - - /** - * rt_mutex_init_proxy_locked - initialize and lock a rt_mutex on behalf of a -@@ -1695,6 +2033,14 @@ void rt_mutex_init_proxy_locked(struct rt_mutex *lock, - struct task_struct *proxy_owner) - { - __rt_mutex_init(lock, NULL, NULL); -+#ifdef CONFIG_DEBUG_SPINLOCK -+ /* -+ * get another key class for the wait_lock. LOCK_PI and UNLOCK_PI is -+ * holding the ->wait_lock of the proxy_lock while unlocking a sleeping -+ * lock. 
-+ */ -+ raw_spin_lock_init(&lock->wait_lock); -+#endif - debug_rt_mutex_proxy_lock(lock, proxy_owner); - rt_mutex_set_owner(lock, proxy_owner); - } -@@ -1717,6 +2063,26 @@ void rt_mutex_proxy_unlock(struct rt_mutex *lock) - rt_mutex_set_owner(lock, NULL); - } - -+static void fixup_rt_mutex_blocked(struct rt_mutex *lock) -+{ -+ struct task_struct *tsk = current; -+ /* -+ * RT has a problem here when the wait got interrupted by a timeout -+ * or a signal. task->pi_blocked_on is still set. The task must -+ * acquire the hash bucket lock when returning from this function. -+ * -+ * If the hash bucket lock is contended then the -+ * BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in -+ * task_blocks_on_rt_mutex() will trigger. This can be avoided by -+ * clearing task->pi_blocked_on which removes the task from the -+ * boosting chain of the rtmutex. That's correct because the task -+ * is not longer blocked on it. -+ */ -+ raw_spin_lock(&tsk->pi_lock); -+ tsk->pi_blocked_on = NULL; -+ raw_spin_unlock(&tsk->pi_lock); -+} -+ - /** - * __rt_mutex_start_proxy_lock() - Start lock acquisition for another task - * @lock: the rt_mutex to take -@@ -1789,6 +2155,9 @@ int __rt_mutex_start_proxy_lock(struct rt_mutex *lock, - ret = 0; - } - -+ if (ret) -+ fixup_rt_mutex_blocked(lock); -+ - return ret; - } - -@@ -1878,6 +2247,9 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock, - * have to fix that up. 
- */ - fixup_rt_mutex_waiters(lock); -+ if (ret) -+ fixup_rt_mutex_blocked(lock); -+ - raw_spin_unlock_irq(&lock->wait_lock); - - return ret; -diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h -index 9d1e974ca9c3..c1a280167e3c 100644 ---- a/kernel/locking/rtmutex_common.h -+++ b/kernel/locking/rtmutex_common.h -@@ -31,6 +31,7 @@ struct rt_mutex_waiter { - struct task_struct *task; - struct rt_mutex *lock; - int prio; -+ bool savestate; - u64 deadline; - }; - -@@ -133,7 +134,7 @@ extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock); - extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock, - struct task_struct *proxy_owner); - extern void rt_mutex_proxy_unlock(struct rt_mutex *lock); --extern void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter); -+extern void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter, bool savetate); - extern int __rt_mutex_start_proxy_lock(struct rt_mutex *lock, - struct rt_mutex_waiter *waiter, - struct task_struct *task); -@@ -151,9 +152,12 @@ extern int __rt_mutex_futex_trylock(struct rt_mutex *l); - - extern void rt_mutex_futex_unlock(struct rt_mutex *lock); - extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock, -- struct wake_q_head *wqh); -+ struct wake_q_head *wqh, -+ struct wake_q_head *wq_sleeper); -+ -+extern void rt_mutex_postunlock(struct wake_q_head *wake_q, -+ struct wake_q_head *wake_sleeper_q); - --extern void rt_mutex_postunlock(struct wake_q_head *wake_q); - /* RW semaphore special interface */ - - extern int __rt_mutex_lock_state(struct rt_mutex *lock, int state); -@@ -163,6 +167,10 @@ int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state, - struct hrtimer_sleeper *timeout, - enum rtmutex_chainwalk chwalk, - struct rt_mutex_waiter *waiter); -+void __sched rt_spin_lock_slowlock_locked(struct rt_mutex *lock, -+ struct rt_mutex_waiter *waiter, -+ unsigned long flags); -+void __sched rt_spin_lock_slowunlock(struct rt_mutex *lock); - - #ifdef 
CONFIG_DEBUG_RT_MUTEXES - # include "rtmutex-debug.h" -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 200b37c704de..4f197e699690 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -510,9 +510,15 @@ static bool set_nr_if_polling(struct task_struct *p) - #endif - #endif - --static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task) -+static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task, -+ bool sleeper) - { -- struct wake_q_node *node = &task->wake_q; -+ struct wake_q_node *node; -+ -+ if (sleeper) -+ node = &task->wake_q_sleeper; -+ else -+ node = &task->wake_q; - - /* - * Atomically grab the task, if ->wake_q is !nil already it means -@@ -548,7 +554,13 @@ static bool __wake_q_add(struct wake_q_head *head, struct task_struct *task) - */ - void wake_q_add(struct wake_q_head *head, struct task_struct *task) - { -- if (__wake_q_add(head, task)) -+ if (__wake_q_add(head, task, false)) -+ get_task_struct(task); -+} -+ -+void wake_q_add_sleeper(struct wake_q_head *head, struct task_struct *task) -+{ -+ if (__wake_q_add(head, task, true)) - get_task_struct(task); - } - -@@ -571,28 +583,39 @@ void wake_q_add(struct wake_q_head *head, struct task_struct *task) - */ - void wake_q_add_safe(struct wake_q_head *head, struct task_struct *task) - { -- if (!__wake_q_add(head, task)) -+ if (!__wake_q_add(head, task, false)) - put_task_struct(task); - } - --void wake_up_q(struct wake_q_head *head) -+void __wake_up_q(struct wake_q_head *head, bool sleeper) - { - struct wake_q_node *node = head->first; - - while (node != WAKE_Q_TAIL) { - struct task_struct *task; - -- task = container_of(node, struct task_struct, wake_q); -+ if (sleeper) -+ task = container_of(node, struct task_struct, wake_q_sleeper); -+ else -+ task = container_of(node, struct task_struct, wake_q); -+ - BUG_ON(!task); - /* Task can safely be re-inserted now: */ - node = node->next; -- task->wake_q.next = NULL; - -+ if (sleeper) -+ 
task->wake_q_sleeper.next = NULL; -+ else -+ task->wake_q.next = NULL; - /* - * wake_up_process() executes a full barrier, which pairs with - * the queueing in wake_q_add() so as not to miss wakeups. - */ -- wake_up_process(task); -+ if (sleeper) -+ wake_up_lock_sleeper(task); -+ else -+ wake_up_process(task); -+ - put_task_struct(task); - } - } --- -2.30.2 - diff --git a/debian/patches-rt/0174-locking-rtmutex-Allow-rt_mutex_trylock-on-PREEMPT_RT.patch b/debian/patches-rt/0174-locking-rtmutex-Allow-rt_mutex_trylock-on-PREEMPT_RT.patch deleted file mode 100644 index ca8bd1aac..000000000 --- a/debian/patches-rt/0174-locking-rtmutex-Allow-rt_mutex_trylock-on-PREEMPT_RT.patch +++ /dev/null @@ -1,37 +0,0 @@ -From 6ef278560937ad7632281e1789cbdb1ad46c76b1 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 2 Dec 2015 11:34:07 +0100 -Subject: [PATCH 174/296] locking/rtmutex: Allow rt_mutex_trylock() on - PREEMPT_RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Non PREEMPT_RT kernel can deadlock on rt_mutex_trylock() in softirq -context. -On PREEMPT_RT the softirq context is handled in thread context. This -avoids the deadlock in the slow path and PI-boosting will be done on the -correct thread. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/locking/rtmutex.c | 4 ++++ - 1 file changed, 4 insertions(+) - -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index 7a38472b35ec..f983cb23edb5 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -1884,7 +1884,11 @@ int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock) - - int __sched __rt_mutex_trylock(struct rt_mutex *lock) - { -+#ifdef CONFIG_PREEMPT_RT -+ if (WARN_ON_ONCE(in_irq() || in_nmi())) -+#else - if (WARN_ON_ONCE(in_irq() || in_nmi() || in_serving_softirq())) -+#endif - return 0; - - return rt_mutex_fasttrylock(lock, rt_mutex_slowtrylock); --- -2.30.2 - diff --git a/debian/patches-rt/0175-locking-rtmutex-add-mutex-implementation-based-on-rt.patch b/debian/patches-rt/0175-locking-rtmutex-add-mutex-implementation-based-on-rt.patch deleted file mode 100644 index a2f3a297f..000000000 --- a/debian/patches-rt/0175-locking-rtmutex-add-mutex-implementation-based-on-rt.patch +++ /dev/null @@ -1,385 +0,0 @@ -From a59b740356b94771c4ee488e52a1d5276eaf446b Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 12 Oct 2017 17:17:03 +0200 -Subject: [PATCH 175/296] locking/rtmutex: add mutex implementation based on - rtmutex -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/mutex_rt.h | 130 ++++++++++++++++++++++ - kernel/locking/mutex-rt.c | 224 ++++++++++++++++++++++++++++++++++++++ - 2 files changed, 354 insertions(+) - create mode 100644 include/linux/mutex_rt.h - create mode 100644 kernel/locking/mutex-rt.c - -diff --git a/include/linux/mutex_rt.h b/include/linux/mutex_rt.h -new file mode 100644 -index 000000000000..f0b2e07cd5c5 ---- /dev/null -+++ b/include/linux/mutex_rt.h -@@ -0,0 +1,130 @@ -+// SPDX-License-Identifier: 
GPL-2.0-only -+#ifndef __LINUX_MUTEX_RT_H -+#define __LINUX_MUTEX_RT_H -+ -+#ifndef __LINUX_MUTEX_H -+#error "Please include mutex.h" -+#endif -+ -+#include <linux/rtmutex.h> -+ -+/* FIXME: Just for __lockfunc */ -+#include <linux/spinlock.h> -+ -+struct mutex { -+ struct rt_mutex lock; -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ struct lockdep_map dep_map; -+#endif -+}; -+ -+#define __MUTEX_INITIALIZER(mutexname) \ -+ { \ -+ .lock = __RT_MUTEX_INITIALIZER(mutexname.lock) \ -+ __DEP_MAP_MUTEX_INITIALIZER(mutexname) \ -+ } -+ -+#define DEFINE_MUTEX(mutexname) \ -+ struct mutex mutexname = __MUTEX_INITIALIZER(mutexname) -+ -+extern void __mutex_do_init(struct mutex *lock, const char *name, struct lock_class_key *key); -+extern void __lockfunc _mutex_lock(struct mutex *lock); -+extern void __lockfunc _mutex_lock_io_nested(struct mutex *lock, int subclass); -+extern int __lockfunc _mutex_lock_interruptible(struct mutex *lock); -+extern int __lockfunc _mutex_lock_killable(struct mutex *lock); -+extern void __lockfunc _mutex_lock_nested(struct mutex *lock, int subclass); -+extern void __lockfunc _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock); -+extern int __lockfunc _mutex_lock_interruptible_nested(struct mutex *lock, int subclass); -+extern int __lockfunc _mutex_lock_killable_nested(struct mutex *lock, int subclass); -+extern int __lockfunc _mutex_trylock(struct mutex *lock); -+extern void __lockfunc _mutex_unlock(struct mutex *lock); -+ -+#define mutex_is_locked(l) rt_mutex_is_locked(&(l)->lock) -+#define mutex_lock(l) _mutex_lock(l) -+#define mutex_lock_interruptible(l) _mutex_lock_interruptible(l) -+#define mutex_lock_killable(l) _mutex_lock_killable(l) -+#define mutex_trylock(l) _mutex_trylock(l) -+#define mutex_unlock(l) _mutex_unlock(l) -+#define mutex_lock_io(l) _mutex_lock_io_nested(l, 0); -+ -+#define __mutex_owner(l) ((l)->lock.owner) -+ -+#ifdef CONFIG_DEBUG_MUTEXES -+#define mutex_destroy(l) rt_mutex_destroy(&(l)->lock) -+#else -+static 
inline void mutex_destroy(struct mutex *lock) {} -+#endif -+ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+# define mutex_lock_nested(l, s) _mutex_lock_nested(l, s) -+# define mutex_lock_interruptible_nested(l, s) \ -+ _mutex_lock_interruptible_nested(l, s) -+# define mutex_lock_killable_nested(l, s) \ -+ _mutex_lock_killable_nested(l, s) -+# define mutex_lock_io_nested(l, s) _mutex_lock_io_nested(l, s) -+ -+# define mutex_lock_nest_lock(lock, nest_lock) \ -+do { \ -+ typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \ -+ _mutex_lock_nest_lock(lock, &(nest_lock)->dep_map); \ -+} while (0) -+ -+#else -+# define mutex_lock_nested(l, s) _mutex_lock(l) -+# define mutex_lock_interruptible_nested(l, s) \ -+ _mutex_lock_interruptible(l) -+# define mutex_lock_killable_nested(l, s) \ -+ _mutex_lock_killable(l) -+# define mutex_lock_nest_lock(lock, nest_lock) mutex_lock(lock) -+# define mutex_lock_io_nested(l, s) _mutex_lock_io_nested(l, s) -+#endif -+ -+# define mutex_init(mutex) \ -+do { \ -+ static struct lock_class_key __key; \ -+ \ -+ rt_mutex_init(&(mutex)->lock); \ -+ __mutex_do_init((mutex), #mutex, &__key); \ -+} while (0) -+ -+# define __mutex_init(mutex, name, key) \ -+do { \ -+ rt_mutex_init(&(mutex)->lock); \ -+ __mutex_do_init((mutex), name, key); \ -+} while (0) -+ -+/** -+ * These values are chosen such that FAIL and SUCCESS match the -+ * values of the regular mutex_trylock(). -+ */ -+enum mutex_trylock_recursive_enum { -+ MUTEX_TRYLOCK_FAILED = 0, -+ MUTEX_TRYLOCK_SUCCESS = 1, -+ MUTEX_TRYLOCK_RECURSIVE, -+}; -+/** -+ * mutex_trylock_recursive - trylock variant that allows recursive locking -+ * @lock: mutex to be locked -+ * -+ * This function should not be used, _ever_. It is purely for hysterical GEM -+ * raisins, and once those are gone this will be removed. -+ * -+ * Returns: -+ * MUTEX_TRYLOCK_FAILED - trylock failed, -+ * MUTEX_TRYLOCK_SUCCESS - lock acquired, -+ * MUTEX_TRYLOCK_RECURSIVE - we already owned the lock. 
-+ */ -+int __rt_mutex_owner_current(struct rt_mutex *lock); -+ -+static inline /* __deprecated */ __must_check enum mutex_trylock_recursive_enum -+mutex_trylock_recursive(struct mutex *lock) -+{ -+ if (unlikely(__rt_mutex_owner_current(&lock->lock))) -+ return MUTEX_TRYLOCK_RECURSIVE; -+ -+ return mutex_trylock(lock); -+} -+ -+extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock); -+ -+#endif -diff --git a/kernel/locking/mutex-rt.c b/kernel/locking/mutex-rt.c -new file mode 100644 -index 000000000000..2b849e6b9b4a ---- /dev/null -+++ b/kernel/locking/mutex-rt.c -@@ -0,0 +1,224 @@ -+// SPDX-License-Identifier: GPL-2.0-only -+/* -+ * Real-Time Preemption Support -+ * -+ * started by Ingo Molnar: -+ * -+ * Copyright (C) 2004-2006 Red Hat, Inc., Ingo Molnar <mingo@redhat.com> -+ * Copyright (C) 2006, Timesys Corp., Thomas Gleixner <tglx@timesys.com> -+ * -+ * historic credit for proving that Linux spinlocks can be implemented via -+ * RT-aware mutexes goes to many people: The Pmutex project (Dirk Grambow -+ * and others) who prototyped it on 2.4 and did lots of comparative -+ * research and analysis; TimeSys, for proving that you can implement a -+ * fully preemptible kernel via the use of IRQ threading and mutexes; -+ * Bill Huey for persuasively arguing on lkml that the mutex model is the -+ * right one; and to MontaVista, who ported pmutexes to 2.6. -+ * -+ * This code is a from-scratch implementation and is not based on pmutexes, -+ * but the idea of converting spinlocks to mutexes is used here too. -+ * -+ * lock debugging, locking tree, deadlock detection: -+ * -+ * Copyright (C) 2004, LynuxWorks, Inc., Igor Manyilov, Bill Huey -+ * Released under the General Public License (GPL). -+ * -+ * Includes portions of the generic R/W semaphore implementation from: -+ * -+ * Copyright (c) 2001 David Howells (dhowells@redhat.com). 
-+ * - Derived partially from idea by Andrea Arcangeli <andrea@suse.de> -+ * - Derived also from comments by Linus -+ * -+ * Pending ownership of locks and ownership stealing: -+ * -+ * Copyright (C) 2005, Kihon Technologies Inc., Steven Rostedt -+ * -+ * (also by Steven Rostedt) -+ * - Converted single pi_lock to individual task locks. -+ * -+ * By Esben Nielsen: -+ * Doing priority inheritance with help of the scheduler. -+ * -+ * Copyright (C) 2006, Timesys Corp., Thomas Gleixner <tglx@timesys.com> -+ * - major rework based on Esben Nielsens initial patch -+ * - replaced thread_info references by task_struct refs -+ * - removed task->pending_owner dependency -+ * - BKL drop/reacquire for semaphore style locks to avoid deadlocks -+ * in the scheduler return path as discussed with Steven Rostedt -+ * -+ * Copyright (C) 2006, Kihon Technologies Inc. -+ * Steven Rostedt <rostedt@goodmis.org> -+ * - debugged and patched Thomas Gleixner's rework. -+ * - added back the cmpxchg to the rework. -+ * - turned atomic require back on for SMP. 
-+ */ -+ -+#include <linux/spinlock.h> -+#include <linux/rtmutex.h> -+#include <linux/sched.h> -+#include <linux/delay.h> -+#include <linux/module.h> -+#include <linux/kallsyms.h> -+#include <linux/syscalls.h> -+#include <linux/interrupt.h> -+#include <linux/plist.h> -+#include <linux/fs.h> -+#include <linux/futex.h> -+#include <linux/hrtimer.h> -+#include <linux/blkdev.h> -+ -+#include "rtmutex_common.h" -+ -+/* -+ * struct mutex functions -+ */ -+void __mutex_do_init(struct mutex *mutex, const char *name, -+ struct lock_class_key *key) -+{ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ /* -+ * Make sure we are not reinitializing a held lock: -+ */ -+ debug_check_no_locks_freed((void *)mutex, sizeof(*mutex)); -+ lockdep_init_map(&mutex->dep_map, name, key, 0); -+#endif -+ mutex->lock.save_state = 0; -+} -+EXPORT_SYMBOL(__mutex_do_init); -+ -+static int _mutex_lock_blk_flush(struct mutex *lock, int state) -+{ -+ /* -+ * Flush blk before ->pi_blocked_on is set. At schedule() time it is too -+ * late if one of the callbacks needs to acquire a sleeping lock. 
-+ */ -+ if (blk_needs_flush_plug(current)) -+ blk_schedule_flush_plug(current); -+ return __rt_mutex_lock_state(&lock->lock, state); -+} -+ -+void __lockfunc _mutex_lock(struct mutex *lock) -+{ -+ mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_); -+ _mutex_lock_blk_flush(lock, TASK_UNINTERRUPTIBLE); -+} -+EXPORT_SYMBOL(_mutex_lock); -+ -+void __lockfunc _mutex_lock_io_nested(struct mutex *lock, int subclass) -+{ -+ int token; -+ -+ token = io_schedule_prepare(); -+ -+ mutex_acquire_nest(&lock->dep_map, subclass, 0, NULL, _RET_IP_); -+ __rt_mutex_lock_state(&lock->lock, TASK_UNINTERRUPTIBLE); -+ -+ io_schedule_finish(token); -+} -+EXPORT_SYMBOL_GPL(_mutex_lock_io_nested); -+ -+int __lockfunc _mutex_lock_interruptible(struct mutex *lock) -+{ -+ int ret; -+ -+ mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_); -+ ret = _mutex_lock_blk_flush(lock, TASK_INTERRUPTIBLE); -+ if (ret) -+ mutex_release(&lock->dep_map, _RET_IP_); -+ return ret; -+} -+EXPORT_SYMBOL(_mutex_lock_interruptible); -+ -+int __lockfunc _mutex_lock_killable(struct mutex *lock) -+{ -+ int ret; -+ -+ mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_); -+ ret = _mutex_lock_blk_flush(lock, TASK_KILLABLE); -+ if (ret) -+ mutex_release(&lock->dep_map, _RET_IP_); -+ return ret; -+} -+EXPORT_SYMBOL(_mutex_lock_killable); -+ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+void __lockfunc _mutex_lock_nested(struct mutex *lock, int subclass) -+{ -+ mutex_acquire_nest(&lock->dep_map, subclass, 0, NULL, _RET_IP_); -+ _mutex_lock_blk_flush(lock, TASK_UNINTERRUPTIBLE); -+} -+EXPORT_SYMBOL(_mutex_lock_nested); -+ -+void __lockfunc _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest) -+{ -+ mutex_acquire_nest(&lock->dep_map, 0, 0, nest, _RET_IP_); -+ _mutex_lock_blk_flush(lock, TASK_UNINTERRUPTIBLE); -+} -+EXPORT_SYMBOL(_mutex_lock_nest_lock); -+ -+int __lockfunc _mutex_lock_interruptible_nested(struct mutex *lock, int subclass) -+{ -+ int ret; -+ -+ mutex_acquire_nest(&lock->dep_map, subclass, 0, NULL, _RET_IP_); -+ ret = 
_mutex_lock_blk_flush(lock, TASK_INTERRUPTIBLE); -+ if (ret) -+ mutex_release(&lock->dep_map, _RET_IP_); -+ return ret; -+} -+EXPORT_SYMBOL(_mutex_lock_interruptible_nested); -+ -+int __lockfunc _mutex_lock_killable_nested(struct mutex *lock, int subclass) -+{ -+ int ret; -+ -+ mutex_acquire(&lock->dep_map, subclass, 0, _RET_IP_); -+ ret = _mutex_lock_blk_flush(lock, TASK_KILLABLE); -+ if (ret) -+ mutex_release(&lock->dep_map, _RET_IP_); -+ return ret; -+} -+EXPORT_SYMBOL(_mutex_lock_killable_nested); -+#endif -+ -+int __lockfunc _mutex_trylock(struct mutex *lock) -+{ -+ int ret = __rt_mutex_trylock(&lock->lock); -+ -+ if (ret) -+ mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_); -+ -+ return ret; -+} -+EXPORT_SYMBOL(_mutex_trylock); -+ -+void __lockfunc _mutex_unlock(struct mutex *lock) -+{ -+ mutex_release(&lock->dep_map, _RET_IP_); -+ __rt_mutex_unlock(&lock->lock); -+} -+EXPORT_SYMBOL(_mutex_unlock); -+ -+/** -+ * atomic_dec_and_mutex_lock - return holding mutex if we dec to 0 -+ * @cnt: the atomic which we are to dec -+ * @lock: the mutex to return holding if we dec to 0 -+ * -+ * return true and hold lock if we dec to 0, return false otherwise -+ */ -+int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock) -+{ -+ /* dec if we can't possibly hit 0 */ -+ if (atomic_add_unless(cnt, -1, 1)) -+ return 0; -+ /* we might hit 0, so take the lock */ -+ mutex_lock(lock); -+ if (!atomic_dec_and_test(cnt)) { -+ /* when we actually did the dec, we didn't hit 0 */ -+ mutex_unlock(lock); -+ return 0; -+ } -+ /* we hit 0, and we hold the lock */ -+ return 1; -+} -+EXPORT_SYMBOL(atomic_dec_and_mutex_lock); --- -2.30.2 - diff --git a/debian/patches-rt/0176-locking-rtmutex-add-rwsem-implementation-based-on-rt.patch b/debian/patches-rt/0176-locking-rtmutex-add-rwsem-implementation-based-on-rt.patch deleted file mode 100644 index ad8017e8f..000000000 --- a/debian/patches-rt/0176-locking-rtmutex-add-rwsem-implementation-based-on-rt.patch +++ /dev/null @@ -1,455 +0,0 @@ -From 
fac7117af7ccbb58836e509af75c396f72a28f97 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 12 Oct 2017 17:28:34 +0200 -Subject: [PATCH 176/296] locking/rtmutex: add rwsem implementation based on - rtmutex -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The RT specific R/W semaphore implementation restricts the number of readers -to one because a writer cannot block on multiple readers and inherit its -priority or budget. - -The single reader restricting is painful in various ways: - - - Performance bottleneck for multi-threaded applications in the page fault - path (mmap sem) - - - Progress blocker for drivers which are carefully crafted to avoid the - potential reader/writer deadlock in mainline. - -The analysis of the writer code paths shows, that properly written RT tasks -should not take them. Syscalls like mmap(), file access which take mmap sem -write locked have unbound latencies which are completely unrelated to mmap -sem. Other R/W sem users like graphics drivers are not suitable for RT tasks -either. - -So there is little risk to hurt RT tasks when the RT rwsem implementation is -changed in the following way: - - - Allow concurrent readers - - - Make writers block until the last reader left the critical section. This - blocking is not subject to priority/budget inheritance. - - - Readers blocked on a writer inherit their priority/budget in the normal - way. - -There is a drawback with this scheme. R/W semaphores become writer unfair -though the applications which have triggered writer starvation (mostly on -mmap_sem) in the past are not really the typical workloads running on a RT -system. So while it's unlikely to hit writer starvation, it's possible. If -there are unexpected workloads on RT systems triggering it, we need to rethink -the approach. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/rwsem-rt.h | 70 +++++++++ - kernel/locking/rwsem-rt.c | 318 ++++++++++++++++++++++++++++++++++++++ - 2 files changed, 388 insertions(+) - create mode 100644 include/linux/rwsem-rt.h - create mode 100644 kernel/locking/rwsem-rt.c - -diff --git a/include/linux/rwsem-rt.h b/include/linux/rwsem-rt.h -new file mode 100644 -index 000000000000..0ba8aae9a198 ---- /dev/null -+++ b/include/linux/rwsem-rt.h -@@ -0,0 +1,70 @@ -+// SPDX-License-Identifier: GPL-2.0-only -+#ifndef _LINUX_RWSEM_RT_H -+#define _LINUX_RWSEM_RT_H -+ -+#ifndef _LINUX_RWSEM_H -+#error "Include rwsem.h" -+#endif -+ -+#include <linux/rtmutex.h> -+#include <linux/swait.h> -+ -+#define READER_BIAS (1U << 31) -+#define WRITER_BIAS (1U << 30) -+ -+struct rw_semaphore { -+ atomic_t readers; -+ struct rt_mutex rtmutex; -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ struct lockdep_map dep_map; -+#endif -+}; -+ -+#define __RWSEM_INITIALIZER(name) \ -+{ \ -+ .readers = ATOMIC_INIT(READER_BIAS), \ -+ .rtmutex = __RT_MUTEX_INITIALIZER(name.rtmutex), \ -+ RW_DEP_MAP_INIT(name) \ -+} -+ -+#define DECLARE_RWSEM(lockname) \ -+ struct rw_semaphore lockname = __RWSEM_INITIALIZER(lockname) -+ -+extern void __rwsem_init(struct rw_semaphore *rwsem, const char *name, -+ struct lock_class_key *key); -+ -+#define __init_rwsem(sem, name, key) \ -+do { \ -+ rt_mutex_init(&(sem)->rtmutex); \ -+ __rwsem_init((sem), (name), (key)); \ -+} while (0) -+ -+#define init_rwsem(sem) \ -+do { \ -+ static struct lock_class_key __key; \ -+ \ -+ __init_rwsem((sem), #sem, &__key); \ -+} while (0) -+ -+static inline int rwsem_is_locked(struct rw_semaphore *sem) -+{ -+ return atomic_read(&sem->readers) != READER_BIAS; -+} -+ -+static inline int rwsem_is_contended(struct rw_semaphore *sem) -+{ -+ return atomic_read(&sem->readers) > 0; -+} -+ -+extern void __down_read(struct rw_semaphore *sem); -+extern int 
__down_read_interruptible(struct rw_semaphore *sem); -+extern int __down_read_killable(struct rw_semaphore *sem); -+extern int __down_read_trylock(struct rw_semaphore *sem); -+extern void __down_write(struct rw_semaphore *sem); -+extern int __must_check __down_write_killable(struct rw_semaphore *sem); -+extern int __down_write_trylock(struct rw_semaphore *sem); -+extern void __up_read(struct rw_semaphore *sem); -+extern void __up_write(struct rw_semaphore *sem); -+extern void __downgrade_write(struct rw_semaphore *sem); -+ -+#endif -diff --git a/kernel/locking/rwsem-rt.c b/kernel/locking/rwsem-rt.c -new file mode 100644 -index 000000000000..a0771c150041 ---- /dev/null -+++ b/kernel/locking/rwsem-rt.c -@@ -0,0 +1,318 @@ -+// SPDX-License-Identifier: GPL-2.0-only -+#include <linux/rwsem.h> -+#include <linux/sched/debug.h> -+#include <linux/sched/signal.h> -+#include <linux/export.h> -+#include <linux/blkdev.h> -+ -+#include "rtmutex_common.h" -+ -+/* -+ * RT-specific reader/writer semaphores -+ * -+ * down_write() -+ * 1) Lock sem->rtmutex -+ * 2) Remove the reader BIAS to force readers into the slow path -+ * 3) Wait until all readers have left the critical region -+ * 4) Mark it write locked -+ * -+ * up_write() -+ * 1) Remove the write locked marker -+ * 2) Set the reader BIAS so readers can use the fast path again -+ * 3) Unlock sem->rtmutex to release blocked readers -+ * -+ * down_read() -+ * 1) Try fast path acquisition (reader BIAS is set) -+ * 2) Take sem->rtmutex.wait_lock which protects the writelocked flag -+ * 3) If !writelocked, acquire it for read -+ * 4) If writelocked, block on sem->rtmutex -+ * 5) unlock sem->rtmutex, goto 1) -+ * -+ * up_read() -+ * 1) Try fast path release (reader count != 1) -+ * 2) Wake the writer waiting in down_write()#3 -+ * -+ * down_read()#3 has the consequence, that rw semaphores on RT are not writer -+ * fair, but writers, which should be avoided in RT tasks (think mmap_sem), -+ * are subject to the rtmutex priority/DL 
inheritance mechanism. -+ * -+ * It's possible to make the rw semaphores writer fair by keeping a list of -+ * active readers. A blocked writer would force all newly incoming readers to -+ * block on the rtmutex, but the rtmutex would have to be proxy locked for one -+ * reader after the other. We can't use multi-reader inheritance because there -+ * is no way to support that with SCHED_DEADLINE. Implementing the one by one -+ * reader boosting/handover mechanism is a major surgery for a very dubious -+ * value. -+ * -+ * The risk of writer starvation is there, but the pathological use cases -+ * which trigger it are not necessarily the typical RT workloads. -+ */ -+ -+void __rwsem_init(struct rw_semaphore *sem, const char *name, -+ struct lock_class_key *key) -+{ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ /* -+ * Make sure we are not reinitializing a held semaphore: -+ */ -+ debug_check_no_locks_freed((void *)sem, sizeof(*sem)); -+ lockdep_init_map(&sem->dep_map, name, key, 0); -+#endif -+ atomic_set(&sem->readers, READER_BIAS); -+} -+EXPORT_SYMBOL(__rwsem_init); -+ -+int __down_read_trylock(struct rw_semaphore *sem) -+{ -+ int r, old; -+ -+ /* -+ * Increment reader count, if sem->readers < 0, i.e. READER_BIAS is -+ * set. -+ */ -+ for (r = atomic_read(&sem->readers); r < 0;) { -+ old = atomic_cmpxchg(&sem->readers, r, r + 1); -+ if (likely(old == r)) -+ return 1; -+ r = old; -+ } -+ return 0; -+} -+ -+static int __sched __down_read_common(struct rw_semaphore *sem, int state) -+{ -+ struct rt_mutex *m = &sem->rtmutex; -+ struct rt_mutex_waiter waiter; -+ int ret; -+ -+ if (__down_read_trylock(sem)) -+ return 0; -+ -+ /* -+ * Flush blk before ->pi_blocked_on is set. At schedule() time it is too -+ * late if one of the callbacks needs to acquire a sleeping lock. 
-+ */ -+ if (blk_needs_flush_plug(current)) -+ blk_schedule_flush_plug(current); -+ -+ might_sleep(); -+ raw_spin_lock_irq(&m->wait_lock); -+ /* -+ * Allow readers as long as the writer has not completely -+ * acquired the semaphore for write. -+ */ -+ if (atomic_read(&sem->readers) != WRITER_BIAS) { -+ atomic_inc(&sem->readers); -+ raw_spin_unlock_irq(&m->wait_lock); -+ return 0; -+ } -+ -+ /* -+ * Call into the slow lock path with the rtmutex->wait_lock -+ * held, so this can't result in the following race: -+ * -+ * Reader1 Reader2 Writer -+ * down_read() -+ * down_write() -+ * rtmutex_lock(m) -+ * swait() -+ * down_read() -+ * unlock(m->wait_lock) -+ * up_read() -+ * swake() -+ * lock(m->wait_lock) -+ * sem->writelocked=true -+ * unlock(m->wait_lock) -+ * -+ * up_write() -+ * sem->writelocked=false -+ * rtmutex_unlock(m) -+ * down_read() -+ * down_write() -+ * rtmutex_lock(m) -+ * swait() -+ * rtmutex_lock(m) -+ * -+ * That would put Reader1 behind the writer waiting on -+ * Reader2 to call up_read() which might be unbound. -+ */ -+ rt_mutex_init_waiter(&waiter, false); -+ ret = rt_mutex_slowlock_locked(m, state, NULL, RT_MUTEX_MIN_CHAINWALK, -+ &waiter); -+ /* -+ * The slowlock() above is guaranteed to return with the rtmutex (for -+ * ret = 0) is now held, so there can't be a writer active. Increment -+ * the reader count and immediately drop the rtmutex again. -+ * For ret != 0 we don't hold the rtmutex and need unlock the wait_lock. -+ * We don't own the lock then. 
-+ */ -+ if (!ret) -+ atomic_inc(&sem->readers); -+ raw_spin_unlock_irq(&m->wait_lock); -+ if (!ret) -+ __rt_mutex_unlock(m); -+ -+ debug_rt_mutex_free_waiter(&waiter); -+ return ret; -+} -+ -+void __down_read(struct rw_semaphore *sem) -+{ -+ int ret; -+ -+ ret = __down_read_common(sem, TASK_UNINTERRUPTIBLE); -+ WARN_ON_ONCE(ret); -+} -+ -+int __down_read_interruptible(struct rw_semaphore *sem) -+{ -+ int ret; -+ -+ ret = __down_read_common(sem, TASK_INTERRUPTIBLE); -+ if (likely(!ret)) -+ return ret; -+ WARN_ONCE(ret != -EINTR, "Unexpected state: %d\n", ret); -+ return -EINTR; -+} -+ -+int __down_read_killable(struct rw_semaphore *sem) -+{ -+ int ret; -+ -+ ret = __down_read_common(sem, TASK_KILLABLE); -+ if (likely(!ret)) -+ return ret; -+ WARN_ONCE(ret != -EINTR, "Unexpected state: %d\n", ret); -+ return -EINTR; -+} -+ -+void __up_read(struct rw_semaphore *sem) -+{ -+ struct rt_mutex *m = &sem->rtmutex; -+ struct task_struct *tsk; -+ -+ /* -+ * sem->readers can only hit 0 when a writer is waiting for the -+ * active readers to leave the critical region. -+ */ -+ if (!atomic_dec_and_test(&sem->readers)) -+ return; -+ -+ might_sleep(); -+ raw_spin_lock_irq(&m->wait_lock); -+ /* -+ * Wake the writer, i.e. the rtmutex owner. It might release the -+ * rtmutex concurrently in the fast path (due to a signal), but to -+ * clean up the rwsem it needs to acquire m->wait_lock. The worst -+ * case which can happen is a spurious wakeup. 
-+ */ -+ tsk = rt_mutex_owner(m); -+ if (tsk) -+ wake_up_process(tsk); -+ -+ raw_spin_unlock_irq(&m->wait_lock); -+} -+ -+static void __up_write_unlock(struct rw_semaphore *sem, int bias, -+ unsigned long flags) -+{ -+ struct rt_mutex *m = &sem->rtmutex; -+ -+ atomic_add(READER_BIAS - bias, &sem->readers); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ __rt_mutex_unlock(m); -+} -+ -+static int __sched __down_write_common(struct rw_semaphore *sem, int state) -+{ -+ struct rt_mutex *m = &sem->rtmutex; -+ unsigned long flags; -+ -+ /* -+ * Flush blk before ->pi_blocked_on is set. At schedule() time it is too -+ * late if one of the callbacks needs to acquire a sleeping lock. -+ */ -+ if (blk_needs_flush_plug(current)) -+ blk_schedule_flush_plug(current); -+ -+ /* Take the rtmutex as a first step */ -+ if (__rt_mutex_lock_state(m, state)) -+ return -EINTR; -+ -+ /* Force readers into slow path */ -+ atomic_sub(READER_BIAS, &sem->readers); -+ might_sleep(); -+ -+ set_current_state(state); -+ for (;;) { -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ /* Have all readers left the critical region? 
*/ -+ if (!atomic_read(&sem->readers)) { -+ atomic_set(&sem->readers, WRITER_BIAS); -+ __set_current_state(TASK_RUNNING); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ return 0; -+ } -+ -+ if (signal_pending_state(state, current)) { -+ __set_current_state(TASK_RUNNING); -+ __up_write_unlock(sem, 0, flags); -+ return -EINTR; -+ } -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ -+ if (atomic_read(&sem->readers) != 0) { -+ schedule(); -+ set_current_state(state); -+ } -+ } -+} -+ -+void __sched __down_write(struct rw_semaphore *sem) -+{ -+ __down_write_common(sem, TASK_UNINTERRUPTIBLE); -+} -+ -+int __sched __down_write_killable(struct rw_semaphore *sem) -+{ -+ return __down_write_common(sem, TASK_KILLABLE); -+} -+ -+int __down_write_trylock(struct rw_semaphore *sem) -+{ -+ struct rt_mutex *m = &sem->rtmutex; -+ unsigned long flags; -+ -+ if (!__rt_mutex_trylock(m)) -+ return 0; -+ -+ atomic_sub(READER_BIAS, &sem->readers); -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ if (!atomic_read(&sem->readers)) { -+ atomic_set(&sem->readers, WRITER_BIAS); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ return 1; -+ } -+ __up_write_unlock(sem, 0, flags); -+ return 0; -+} -+ -+void __up_write(struct rw_semaphore *sem) -+{ -+ struct rt_mutex *m = &sem->rtmutex; -+ unsigned long flags; -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ __up_write_unlock(sem, WRITER_BIAS, flags); -+} -+ -+void __downgrade_write(struct rw_semaphore *sem) -+{ -+ struct rt_mutex *m = &sem->rtmutex; -+ unsigned long flags; -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ /* Release it and account current as reader */ -+ __up_write_unlock(sem, WRITER_BIAS - 1, flags); -+} --- -2.30.2 - diff --git a/debian/patches-rt/0177-locking-rtmutex-add-rwlock-implementation-based-on-r.patch b/debian/patches-rt/0177-locking-rtmutex-add-rwlock-implementation-based-on-r.patch deleted file mode 100644 index 4fa0f647d..000000000 --- 
a/debian/patches-rt/0177-locking-rtmutex-add-rwlock-implementation-based-on-r.patch +++ /dev/null @@ -1,548 +0,0 @@ -From 61ecc595cda25b3c1204fae195dc9e963f9d1abf Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 12 Oct 2017 17:18:06 +0200 -Subject: [PATCH 177/296] locking/rtmutex: add rwlock implementation based on - rtmutex -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The implementation is bias-based, similar to the rwsem implementation. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/rwlock_rt.h | 109 +++++++++++ - include/linux/rwlock_types_rt.h | 56 ++++++ - kernel/Kconfig.locks | 2 +- - kernel/locking/rwlock-rt.c | 328 ++++++++++++++++++++++++++++++++ - 4 files changed, 494 insertions(+), 1 deletion(-) - create mode 100644 include/linux/rwlock_rt.h - create mode 100644 include/linux/rwlock_types_rt.h - create mode 100644 kernel/locking/rwlock-rt.c - -diff --git a/include/linux/rwlock_rt.h b/include/linux/rwlock_rt.h -new file mode 100644 -index 000000000000..aafdb0a685d5 ---- /dev/null -+++ b/include/linux/rwlock_rt.h -@@ -0,0 +1,109 @@ -+// SPDX-License-Identifier: GPL-2.0-only -+#ifndef __LINUX_RWLOCK_RT_H -+#define __LINUX_RWLOCK_RT_H -+ -+#ifndef __LINUX_SPINLOCK_H -+#error Do not include directly. 
Use spinlock.h -+#endif -+ -+extern void __lockfunc rt_write_lock(rwlock_t *rwlock); -+extern void __lockfunc rt_read_lock(rwlock_t *rwlock); -+extern int __lockfunc rt_write_trylock(rwlock_t *rwlock); -+extern int __lockfunc rt_read_trylock(rwlock_t *rwlock); -+extern void __lockfunc rt_write_unlock(rwlock_t *rwlock); -+extern void __lockfunc rt_read_unlock(rwlock_t *rwlock); -+extern int __lockfunc rt_read_can_lock(rwlock_t *rwlock); -+extern int __lockfunc rt_write_can_lock(rwlock_t *rwlock); -+extern void __rt_rwlock_init(rwlock_t *rwlock, char *name, struct lock_class_key *key); -+ -+#define read_can_lock(rwlock) rt_read_can_lock(rwlock) -+#define write_can_lock(rwlock) rt_write_can_lock(rwlock) -+ -+#define read_trylock(lock) __cond_lock(lock, rt_read_trylock(lock)) -+#define write_trylock(lock) __cond_lock(lock, rt_write_trylock(lock)) -+ -+static inline int __write_trylock_rt_irqsave(rwlock_t *lock, unsigned long *flags) -+{ -+ *flags = 0; -+ return rt_write_trylock(lock); -+} -+ -+#define write_trylock_irqsave(lock, flags) \ -+ __cond_lock(lock, __write_trylock_rt_irqsave(lock, &(flags))) -+ -+#define read_lock_irqsave(lock, flags) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ rt_read_lock(lock); \ -+ flags = 0; \ -+ } while (0) -+ -+#define write_lock_irqsave(lock, flags) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ rt_write_lock(lock); \ -+ flags = 0; \ -+ } while (0) -+ -+#define read_lock(lock) rt_read_lock(lock) -+ -+#define read_lock_bh(lock) \ -+ do { \ -+ local_bh_disable(); \ -+ rt_read_lock(lock); \ -+ } while (0) -+ -+#define read_lock_irq(lock) read_lock(lock) -+ -+#define write_lock(lock) rt_write_lock(lock) -+ -+#define write_lock_bh(lock) \ -+ do { \ -+ local_bh_disable(); \ -+ rt_write_lock(lock); \ -+ } while (0) -+ -+#define write_lock_irq(lock) write_lock(lock) -+ -+#define read_unlock(lock) rt_read_unlock(lock) -+ -+#define read_unlock_bh(lock) \ -+ do { \ -+ rt_read_unlock(lock); \ -+ local_bh_enable(); \ -+ } while (0) 
-+ -+#define read_unlock_irq(lock) read_unlock(lock) -+ -+#define write_unlock(lock) rt_write_unlock(lock) -+ -+#define write_unlock_bh(lock) \ -+ do { \ -+ rt_write_unlock(lock); \ -+ local_bh_enable(); \ -+ } while (0) -+ -+#define write_unlock_irq(lock) write_unlock(lock) -+ -+#define read_unlock_irqrestore(lock, flags) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ (void) flags; \ -+ rt_read_unlock(lock); \ -+ } while (0) -+ -+#define write_unlock_irqrestore(lock, flags) \ -+ do { \ -+ typecheck(unsigned long, flags); \ -+ (void) flags; \ -+ rt_write_unlock(lock); \ -+ } while (0) -+ -+#define rwlock_init(rwl) \ -+do { \ -+ static struct lock_class_key __key; \ -+ \ -+ __rt_rwlock_init(rwl, #rwl, &__key); \ -+} while (0) -+ -+#endif -diff --git a/include/linux/rwlock_types_rt.h b/include/linux/rwlock_types_rt.h -new file mode 100644 -index 000000000000..4762391d659b ---- /dev/null -+++ b/include/linux/rwlock_types_rt.h -@@ -0,0 +1,56 @@ -+// SPDX-License-Identifier: GPL-2.0-only -+#ifndef __LINUX_RWLOCK_TYPES_RT_H -+#define __LINUX_RWLOCK_TYPES_RT_H -+ -+#ifndef __LINUX_SPINLOCK_TYPES_H -+#error "Do not include directly. Include spinlock_types.h instead" -+#endif -+ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+# define RW_DEP_MAP_INIT(lockname) .dep_map = { .name = #lockname } -+#else -+# define RW_DEP_MAP_INIT(lockname) -+#endif -+ -+typedef struct rt_rw_lock rwlock_t; -+ -+#define __RW_LOCK_UNLOCKED(name) __RWLOCK_RT_INITIALIZER(name) -+ -+#define DEFINE_RWLOCK(name) \ -+ rwlock_t name = __RW_LOCK_UNLOCKED(name) -+ -+/* -+ * A reader biased implementation primarily for CPU pinning. 
-+ * -+ * Can be selected as general replacement for the single reader RT rwlock -+ * variant -+ */ -+struct rt_rw_lock { -+ struct rt_mutex rtmutex; -+ atomic_t readers; -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ struct lockdep_map dep_map; -+#endif -+}; -+ -+#define READER_BIAS (1U << 31) -+#define WRITER_BIAS (1U << 30) -+ -+#define __RWLOCK_RT_INITIALIZER(name) \ -+{ \ -+ .readers = ATOMIC_INIT(READER_BIAS), \ -+ .rtmutex = __RT_MUTEX_INITIALIZER_SAVE_STATE(name.rtmutex), \ -+ RW_DEP_MAP_INIT(name) \ -+} -+ -+void __rwlock_biased_rt_init(struct rt_rw_lock *lock, const char *name, -+ struct lock_class_key *key); -+ -+#define rwlock_biased_rt_init(rwlock) \ -+ do { \ -+ static struct lock_class_key __key; \ -+ \ -+ __rwlock_biased_rt_init((rwlock), #rwlock, &__key); \ -+ } while (0) -+ -+#endif -diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks -index 3de8fd11873b..4198f0273ecd 100644 ---- a/kernel/Kconfig.locks -+++ b/kernel/Kconfig.locks -@@ -251,7 +251,7 @@ config ARCH_USE_QUEUED_RWLOCKS - - config QUEUED_RWLOCKS - def_bool y if ARCH_USE_QUEUED_RWLOCKS -- depends on SMP -+ depends on SMP && !PREEMPT_RT - - config ARCH_HAS_MMIOWB - bool -diff --git a/kernel/locking/rwlock-rt.c b/kernel/locking/rwlock-rt.c -new file mode 100644 -index 000000000000..1ee16b8fedd7 ---- /dev/null -+++ b/kernel/locking/rwlock-rt.c -@@ -0,0 +1,328 @@ -+// SPDX-License-Identifier: GPL-2.0-only -+#include <linux/sched/debug.h> -+#include <linux/export.h> -+ -+#include "rtmutex_common.h" -+#include <linux/rwlock_types_rt.h> -+ -+/* -+ * RT-specific reader/writer locks -+ * -+ * write_lock() -+ * 1) Lock lock->rtmutex -+ * 2) Remove the reader BIAS to force readers into the slow path -+ * 3) Wait until all readers have left the critical region -+ * 4) Mark it write locked -+ * -+ * write_unlock() -+ * 1) Remove the write locked marker -+ * 2) Set the reader BIAS so readers can use the fast path again -+ * 3) Unlock lock->rtmutex to release blocked readers -+ * -+ * read_lock() -+ * 1) Try 
fast path acquisition (reader BIAS is set) -+ * 2) Take lock->rtmutex.wait_lock which protects the writelocked flag -+ * 3) If !writelocked, acquire it for read -+ * 4) If writelocked, block on lock->rtmutex -+ * 5) unlock lock->rtmutex, goto 1) -+ * -+ * read_unlock() -+ * 1) Try fast path release (reader count != 1) -+ * 2) Wake the writer waiting in write_lock()#3 -+ * -+ * read_lock()#3 has the consequence, that rw locks on RT are not writer -+ * fair, but writers, which should be avoided in RT tasks (think tasklist -+ * lock), are subject to the rtmutex priority/DL inheritance mechanism. -+ * -+ * It's possible to make the rw locks writer fair by keeping a list of -+ * active readers. A blocked writer would force all newly incoming readers -+ * to block on the rtmutex, but the rtmutex would have to be proxy locked -+ * for one reader after the other. We can't use multi-reader inheritance -+ * because there is no way to support that with -+ * SCHED_DEADLINE. Implementing the one by one reader boosting/handover -+ * mechanism is a major surgery for a very dubious value. -+ * -+ * The risk of writer starvation is there, but the pathological use cases -+ * which trigger it are not necessarily the typical RT workloads. -+ */ -+ -+void __rwlock_biased_rt_init(struct rt_rw_lock *lock, const char *name, -+ struct lock_class_key *key) -+{ -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+ /* -+ * Make sure we are not reinitializing a held semaphore: -+ */ -+ debug_check_no_locks_freed((void *)lock, sizeof(*lock)); -+ lockdep_init_map(&lock->dep_map, name, key, 0); -+#endif -+ atomic_set(&lock->readers, READER_BIAS); -+ rt_mutex_init(&lock->rtmutex); -+ lock->rtmutex.save_state = 1; -+} -+ -+static int __read_rt_trylock(struct rt_rw_lock *lock) -+{ -+ int r, old; -+ -+ /* -+ * Increment reader count, if lock->readers < 0, i.e. READER_BIAS is -+ * set. 
-+ */ -+ for (r = atomic_read(&lock->readers); r < 0;) { -+ old = atomic_cmpxchg(&lock->readers, r, r + 1); -+ if (likely(old == r)) -+ return 1; -+ r = old; -+ } -+ return 0; -+} -+ -+static void __read_rt_lock(struct rt_rw_lock *lock) -+{ -+ struct rt_mutex *m = &lock->rtmutex; -+ struct rt_mutex_waiter waiter; -+ unsigned long flags; -+ -+ if (__read_rt_trylock(lock)) -+ return; -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ /* -+ * Allow readers as long as the writer has not completely -+ * acquired the semaphore for write. -+ */ -+ if (atomic_read(&lock->readers) != WRITER_BIAS) { -+ atomic_inc(&lock->readers); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ return; -+ } -+ -+ /* -+ * Call into the slow lock path with the rtmutex->wait_lock -+ * held, so this can't result in the following race: -+ * -+ * Reader1 Reader2 Writer -+ * read_lock() -+ * write_lock() -+ * rtmutex_lock(m) -+ * swait() -+ * read_lock() -+ * unlock(m->wait_lock) -+ * read_unlock() -+ * swake() -+ * lock(m->wait_lock) -+ * lock->writelocked=true -+ * unlock(m->wait_lock) -+ * -+ * write_unlock() -+ * lock->writelocked=false -+ * rtmutex_unlock(m) -+ * read_lock() -+ * write_lock() -+ * rtmutex_lock(m) -+ * swait() -+ * rtmutex_lock(m) -+ * -+ * That would put Reader1 behind the writer waiting on -+ * Reader2 to call read_unlock() which might be unbound. -+ */ -+ rt_mutex_init_waiter(&waiter, true); -+ rt_spin_lock_slowlock_locked(m, &waiter, flags); -+ /* -+ * The slowlock() above is guaranteed to return with the rtmutex is -+ * now held, so there can't be a writer active. Increment the reader -+ * count and immediately drop the rtmutex again. 
-+ */ -+ atomic_inc(&lock->readers); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ rt_spin_lock_slowunlock(m); -+ -+ debug_rt_mutex_free_waiter(&waiter); -+} -+ -+static void __read_rt_unlock(struct rt_rw_lock *lock) -+{ -+ struct rt_mutex *m = &lock->rtmutex; -+ struct task_struct *tsk; -+ -+ /* -+ * sem->readers can only hit 0 when a writer is waiting for the -+ * active readers to leave the critical region. -+ */ -+ if (!atomic_dec_and_test(&lock->readers)) -+ return; -+ -+ raw_spin_lock_irq(&m->wait_lock); -+ /* -+ * Wake the writer, i.e. the rtmutex owner. It might release the -+ * rtmutex concurrently in the fast path, but to clean up the rw -+ * lock it needs to acquire m->wait_lock. The worst case which can -+ * happen is a spurious wakeup. -+ */ -+ tsk = rt_mutex_owner(m); -+ if (tsk) -+ wake_up_process(tsk); -+ -+ raw_spin_unlock_irq(&m->wait_lock); -+} -+ -+static void __write_unlock_common(struct rt_rw_lock *lock, int bias, -+ unsigned long flags) -+{ -+ struct rt_mutex *m = &lock->rtmutex; -+ -+ atomic_add(READER_BIAS - bias, &lock->readers); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ rt_spin_lock_slowunlock(m); -+} -+ -+static void __write_rt_lock(struct rt_rw_lock *lock) -+{ -+ struct rt_mutex *m = &lock->rtmutex; -+ struct task_struct *self = current; -+ unsigned long flags; -+ -+ /* Take the rtmutex as a first step */ -+ __rt_spin_lock(m); -+ -+ /* Force readers into slow path */ -+ atomic_sub(READER_BIAS, &lock->readers); -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ -+ raw_spin_lock(&self->pi_lock); -+ self->saved_state = self->state; -+ __set_current_state_no_track(TASK_UNINTERRUPTIBLE); -+ raw_spin_unlock(&self->pi_lock); -+ -+ for (;;) { -+ /* Have all readers left the critical region? 
*/ -+ if (!atomic_read(&lock->readers)) { -+ atomic_set(&lock->readers, WRITER_BIAS); -+ raw_spin_lock(&self->pi_lock); -+ __set_current_state_no_track(self->saved_state); -+ self->saved_state = TASK_RUNNING; -+ raw_spin_unlock(&self->pi_lock); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ return; -+ } -+ -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ -+ if (atomic_read(&lock->readers) != 0) -+ schedule(); -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ -+ raw_spin_lock(&self->pi_lock); -+ __set_current_state_no_track(TASK_UNINTERRUPTIBLE); -+ raw_spin_unlock(&self->pi_lock); -+ } -+} -+ -+static int __write_rt_trylock(struct rt_rw_lock *lock) -+{ -+ struct rt_mutex *m = &lock->rtmutex; -+ unsigned long flags; -+ -+ if (!__rt_mutex_trylock(m)) -+ return 0; -+ -+ atomic_sub(READER_BIAS, &lock->readers); -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ if (!atomic_read(&lock->readers)) { -+ atomic_set(&lock->readers, WRITER_BIAS); -+ raw_spin_unlock_irqrestore(&m->wait_lock, flags); -+ return 1; -+ } -+ __write_unlock_common(lock, 0, flags); -+ return 0; -+} -+ -+static void __write_rt_unlock(struct rt_rw_lock *lock) -+{ -+ struct rt_mutex *m = &lock->rtmutex; -+ unsigned long flags; -+ -+ raw_spin_lock_irqsave(&m->wait_lock, flags); -+ __write_unlock_common(lock, WRITER_BIAS, flags); -+} -+ -+int __lockfunc rt_read_can_lock(rwlock_t *rwlock) -+{ -+ return atomic_read(&rwlock->readers) < 0; -+} -+ -+int __lockfunc rt_write_can_lock(rwlock_t *rwlock) -+{ -+ return atomic_read(&rwlock->readers) == READER_BIAS; -+} -+ -+/* -+ * The common functions which get wrapped into the rwlock API. 
-+ */ -+int __lockfunc rt_read_trylock(rwlock_t *rwlock) -+{ -+ int ret; -+ -+ ret = __read_rt_trylock(rwlock); -+ if (ret) { -+ rwlock_acquire_read(&rwlock->dep_map, 0, 1, _RET_IP_); -+ migrate_disable(); -+ } -+ return ret; -+} -+EXPORT_SYMBOL(rt_read_trylock); -+ -+int __lockfunc rt_write_trylock(rwlock_t *rwlock) -+{ -+ int ret; -+ -+ ret = __write_rt_trylock(rwlock); -+ if (ret) { -+ rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_); -+ migrate_disable(); -+ } -+ return ret; -+} -+EXPORT_SYMBOL(rt_write_trylock); -+ -+void __lockfunc rt_read_lock(rwlock_t *rwlock) -+{ -+ rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_); -+ __read_rt_lock(rwlock); -+ migrate_disable(); -+} -+EXPORT_SYMBOL(rt_read_lock); -+ -+void __lockfunc rt_write_lock(rwlock_t *rwlock) -+{ -+ rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_); -+ __write_rt_lock(rwlock); -+ migrate_disable(); -+} -+EXPORT_SYMBOL(rt_write_lock); -+ -+void __lockfunc rt_read_unlock(rwlock_t *rwlock) -+{ -+ rwlock_release(&rwlock->dep_map, _RET_IP_); -+ migrate_enable(); -+ __read_rt_unlock(rwlock); -+} -+EXPORT_SYMBOL(rt_read_unlock); -+ -+void __lockfunc rt_write_unlock(rwlock_t *rwlock) -+{ -+ rwlock_release(&rwlock->dep_map, _RET_IP_); -+ migrate_enable(); -+ __write_rt_unlock(rwlock); -+} -+EXPORT_SYMBOL(rt_write_unlock); -+ -+void __rt_rwlock_init(rwlock_t *rwlock, char *name, struct lock_class_key *key) -+{ -+ __rwlock_biased_rt_init(rwlock, name, key); -+} -+EXPORT_SYMBOL(__rt_rwlock_init); --- -2.30.2 - diff --git a/debian/patches-rt/0178-locking-rtmutex-wire-up-RT-s-locking.patch b/debian/patches-rt/0178-locking-rtmutex-wire-up-RT-s-locking.patch deleted file mode 100644 index a1d9571f5..000000000 --- a/debian/patches-rt/0178-locking-rtmutex-wire-up-RT-s-locking.patch +++ /dev/null @@ -1,348 +0,0 @@ -From 4bb183116d9f6ffba96503652dbc917ea30a4b56 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 12 Oct 2017 17:31:14 +0200 -Subject: [PATCH 178/296] locking/rtmutex: wire 
up RT's locking -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/mutex.h | 26 ++++++++++++++++---------- - include/linux/rwsem.h | 12 ++++++++++++ - include/linux/spinlock.h | 12 +++++++++++- - include/linux/spinlock_api_smp.h | 4 +++- - include/linux/spinlock_types.h | 11 ++++++++--- - include/linux/spinlock_types_up.h | 2 +- - kernel/Kconfig.preempt | 1 + - kernel/locking/Makefile | 10 +++++++--- - kernel/locking/rwsem.c | 6 ++++++ - kernel/locking/spinlock.c | 7 +++++++ - kernel/locking/spinlock_debug.c | 5 +++++ - 11 files changed, 77 insertions(+), 19 deletions(-) - -diff --git a/include/linux/mutex.h b/include/linux/mutex.h -index 4d671fba3cab..e45774a337d2 100644 ---- a/include/linux/mutex.h -+++ b/include/linux/mutex.h -@@ -22,6 +22,20 @@ - - struct ww_acquire_ctx; - -+#ifdef CONFIG_DEBUG_LOCK_ALLOC -+# define __DEP_MAP_MUTEX_INITIALIZER(lockname) \ -+ , .dep_map = { \ -+ .name = #lockname, \ -+ .wait_type_inner = LD_WAIT_SLEEP, \ -+ } -+#else -+# define __DEP_MAP_MUTEX_INITIALIZER(lockname) -+#endif -+ -+#ifdef CONFIG_PREEMPT_RT -+# include <linux/mutex_rt.h> -+#else -+ - /* - * Simple, straightforward mutexes with strict semantics: - * -@@ -119,16 +133,6 @@ do { \ - __mutex_init((mutex), #mutex, &__key); \ - } while (0) - --#ifdef CONFIG_DEBUG_LOCK_ALLOC --# define __DEP_MAP_MUTEX_INITIALIZER(lockname) \ -- , .dep_map = { \ -- .name = #lockname, \ -- .wait_type_inner = LD_WAIT_SLEEP, \ -- } --#else --# define __DEP_MAP_MUTEX_INITIALIZER(lockname) --#endif -- - #define __MUTEX_INITIALIZER(lockname) \ - { .owner = ATOMIC_LONG_INIT(0) \ - , .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \ -@@ -224,4 +228,6 @@ enum mutex_trylock_recursive_enum { - extern /* __deprecated */ __must_check enum mutex_trylock_recursive_enum - mutex_trylock_recursive(struct mutex 
*lock); - -+#endif /* !PREEMPT_RT */ -+ - #endif /* __LINUX_MUTEX_H */ -diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h -index 4c715be48717..9323af8a9244 100644 ---- a/include/linux/rwsem.h -+++ b/include/linux/rwsem.h -@@ -16,6 +16,11 @@ - #include <linux/spinlock.h> - #include <linux/atomic.h> - #include <linux/err.h> -+ -+#ifdef CONFIG_PREEMPT_RT -+#include <linux/rwsem-rt.h> -+#else /* PREEMPT_RT */ -+ - #ifdef CONFIG_RWSEM_SPIN_ON_OWNER - #include <linux/osq_lock.h> - #endif -@@ -119,6 +124,13 @@ static inline int rwsem_is_contended(struct rw_semaphore *sem) - return !list_empty(&sem->wait_list); - } - -+#endif /* !PREEMPT_RT */ -+ -+/* -+ * The functions below are the same for all rwsem implementations including -+ * the RT specific variant. -+ */ -+ - /* - * lock for reading - */ -diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h -index 79897841a2cc..c3c70291b46c 100644 ---- a/include/linux/spinlock.h -+++ b/include/linux/spinlock.h -@@ -309,7 +309,11 @@ static inline void do_raw_spin_unlock(raw_spinlock_t *lock) __releases(lock) - }) - - /* Include rwlock functions */ --#include <linux/rwlock.h> -+#ifdef CONFIG_PREEMPT_RT -+# include <linux/rwlock_rt.h> -+#else -+# include <linux/rwlock.h> -+#endif - - /* - * Pull the _spin_*()/_read_*()/_write_*() functions/declarations: -@@ -320,6 +324,10 @@ static inline void do_raw_spin_unlock(raw_spinlock_t *lock) __releases(lock) - # include <linux/spinlock_api_up.h> - #endif - -+#ifdef CONFIG_PREEMPT_RT -+# include <linux/spinlock_rt.h> -+#else /* PREEMPT_RT */ -+ - /* - * Map the spin_lock functions to the raw variants for PREEMPT_RT=n - */ -@@ -454,6 +462,8 @@ static __always_inline int spin_is_contended(spinlock_t *lock) - - #define assert_spin_locked(lock) assert_raw_spin_locked(&(lock)->rlock) - -+#endif /* !PREEMPT_RT */ -+ - /* - * Pull the atomic_t declaration: - * (asm-mips/atomic.h needs above definitions) -diff --git a/include/linux/spinlock_api_smp.h 
b/include/linux/spinlock_api_smp.h -index 19a9be9d97ee..da38149f2843 100644 ---- a/include/linux/spinlock_api_smp.h -+++ b/include/linux/spinlock_api_smp.h -@@ -187,6 +187,8 @@ static inline int __raw_spin_trylock_bh(raw_spinlock_t *lock) - return 0; - } - --#include <linux/rwlock_api_smp.h> -+#ifndef CONFIG_PREEMPT_RT -+# include <linux/rwlock_api_smp.h> -+#endif - - #endif /* __LINUX_SPINLOCK_API_SMP_H */ -diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h -index 5c8664d57fb8..8d896d3e1a01 100644 ---- a/include/linux/spinlock_types.h -+++ b/include/linux/spinlock_types.h -@@ -11,8 +11,13 @@ - - #include <linux/spinlock_types_raw.h> - --#include <linux/spinlock_types_nort.h> -- --#include <linux/rwlock_types.h> -+#ifndef CONFIG_PREEMPT_RT -+# include <linux/spinlock_types_nort.h> -+# include <linux/rwlock_types.h> -+#else -+# include <linux/rtmutex.h> -+# include <linux/spinlock_types_rt.h> -+# include <linux/rwlock_types_rt.h> -+#endif - - #endif /* __LINUX_SPINLOCK_TYPES_H */ -diff --git a/include/linux/spinlock_types_up.h b/include/linux/spinlock_types_up.h -index c09b6407ae1b..d9b371fa13e0 100644 ---- a/include/linux/spinlock_types_up.h -+++ b/include/linux/spinlock_types_up.h -@@ -1,7 +1,7 @@ - #ifndef __LINUX_SPINLOCK_TYPES_UP_H - #define __LINUX_SPINLOCK_TYPES_UP_H - --#ifndef __LINUX_SPINLOCK_TYPES_H -+#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__LINUX_RT_MUTEX_H) - # error "please don't include this file directly" - #endif - -diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt -index bf82259cff96..cbe3aa495519 100644 ---- a/kernel/Kconfig.preempt -+++ b/kernel/Kconfig.preempt -@@ -59,6 +59,7 @@ config PREEMPT_RT - bool "Fully Preemptible Kernel (Real-Time)" - depends on EXPERT && ARCH_SUPPORTS_RT - select PREEMPTION -+ select RT_MUTEXES - help - This option turns the kernel into a real-time kernel by replacing - various locking primitives (spinlocks, rwlocks, etc.) 
with -diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile -index 6d11cfb9b41f..c7fbf737e16e 100644 ---- a/kernel/locking/Makefile -+++ b/kernel/locking/Makefile -@@ -3,7 +3,7 @@ - # and is generally not a function of system call inputs. - KCOV_INSTRUMENT := n - --obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o -+obj-y += semaphore.o rwsem.o percpu-rwsem.o - - # Avoid recursion lockdep -> KCSAN -> ... -> lockdep. - KCSAN_SANITIZE_lockdep.o := n -@@ -15,19 +15,23 @@ CFLAGS_REMOVE_mutex-debug.o = $(CC_FLAGS_FTRACE) - CFLAGS_REMOVE_rtmutex-debug.o = $(CC_FLAGS_FTRACE) - endif - --obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o - obj-$(CONFIG_LOCKDEP) += lockdep.o - ifeq ($(CONFIG_PROC_FS),y) - obj-$(CONFIG_LOCKDEP) += lockdep_proc.o - endif - obj-$(CONFIG_SMP) += spinlock.o --obj-$(CONFIG_LOCK_SPIN_ON_OWNER) += osq_lock.o - obj-$(CONFIG_PROVE_LOCKING) += spinlock.o - obj-$(CONFIG_QUEUED_SPINLOCKS) += qspinlock.o - obj-$(CONFIG_RT_MUTEXES) += rtmutex.o - obj-$(CONFIG_DEBUG_RT_MUTEXES) += rtmutex-debug.o - obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock.o - obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock_debug.o -+ifneq ($(CONFIG_PREEMPT_RT),y) -+obj-y += mutex.o -+obj-$(CONFIG_LOCK_SPIN_ON_OWNER) += osq_lock.o -+obj-$(CONFIG_DEBUG_MUTEXES) += mutex-debug.o -+endif -+obj-$(CONFIG_PREEMPT_RT) += mutex-rt.o rwsem-rt.o rwlock-rt.o - obj-$(CONFIG_QUEUED_RWLOCKS) += qrwlock.o - obj-$(CONFIG_LOCK_TORTURE_TEST) += locktorture.o - obj-$(CONFIG_WW_MUTEX_SELFTEST) += test-ww_mutex.o -diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c -index a163542d178e..3821c7204d10 100644 ---- a/kernel/locking/rwsem.c -+++ b/kernel/locking/rwsem.c -@@ -28,6 +28,7 @@ - #include <linux/rwsem.h> - #include <linux/atomic.h> - -+#ifndef CONFIG_PREEMPT_RT - #include "lock_events.h" - - /* -@@ -1494,6 +1495,7 @@ static inline void __downgrade_write(struct rw_semaphore *sem) - if (tmp & RWSEM_FLAG_WAITERS) - rwsem_downgrade_wake(sem); - } -+#endif - - /* - * lock for reading -@@ -1657,7 +1659,9 
@@ void down_read_non_owner(struct rw_semaphore *sem) - { - might_sleep(); - __down_read(sem); -+#ifndef CONFIG_PREEMPT_RT - __rwsem_set_reader_owned(sem, NULL); -+#endif - } - EXPORT_SYMBOL(down_read_non_owner); - -@@ -1686,7 +1690,9 @@ EXPORT_SYMBOL(down_write_killable_nested); - - void up_read_non_owner(struct rw_semaphore *sem) - { -+#ifndef CONFIG_PREEMPT_RT - DEBUG_RWSEMS_WARN_ON(!is_rwsem_reader_owned(sem), sem); -+#endif - __up_read(sem); - } - EXPORT_SYMBOL(up_read_non_owner); -diff --git a/kernel/locking/spinlock.c b/kernel/locking/spinlock.c -index 0ff08380f531..45445a2f1799 100644 ---- a/kernel/locking/spinlock.c -+++ b/kernel/locking/spinlock.c -@@ -124,8 +124,11 @@ void __lockfunc __raw_##op##_lock_bh(locktype##_t *lock) \ - * __[spin|read|write]_lock_bh() - */ - BUILD_LOCK_OPS(spin, raw_spinlock); -+ -+#ifndef CONFIG_PREEMPT_RT - BUILD_LOCK_OPS(read, rwlock); - BUILD_LOCK_OPS(write, rwlock); -+#endif - - #endif - -@@ -209,6 +212,8 @@ void __lockfunc _raw_spin_unlock_bh(raw_spinlock_t *lock) - EXPORT_SYMBOL(_raw_spin_unlock_bh); - #endif - -+#ifndef CONFIG_PREEMPT_RT -+ - #ifndef CONFIG_INLINE_READ_TRYLOCK - int __lockfunc _raw_read_trylock(rwlock_t *lock) - { -@@ -353,6 +358,8 @@ void __lockfunc _raw_write_unlock_bh(rwlock_t *lock) - EXPORT_SYMBOL(_raw_write_unlock_bh); - #endif - -+#endif /* !PREEMPT_RT */ -+ - #ifdef CONFIG_DEBUG_LOCK_ALLOC - - void __lockfunc _raw_spin_lock_nested(raw_spinlock_t *lock, int subclass) -diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c -index b9d93087ee66..72e306e0e8a3 100644 ---- a/kernel/locking/spinlock_debug.c -+++ b/kernel/locking/spinlock_debug.c -@@ -31,6 +31,7 @@ void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name, - - EXPORT_SYMBOL(__raw_spin_lock_init); - -+#ifndef CONFIG_PREEMPT_RT - void __rwlock_init(rwlock_t *lock, const char *name, - struct lock_class_key *key) - { -@@ -48,6 +49,7 @@ void __rwlock_init(rwlock_t *lock, const char *name, - } - - 
EXPORT_SYMBOL(__rwlock_init); -+#endif - - static void spin_dump(raw_spinlock_t *lock, const char *msg) - { -@@ -139,6 +141,7 @@ void do_raw_spin_unlock(raw_spinlock_t *lock) - arch_spin_unlock(&lock->raw_lock); - } - -+#ifndef CONFIG_PREEMPT_RT - static void rwlock_bug(rwlock_t *lock, const char *msg) - { - if (!debug_locks_off()) -@@ -228,3 +231,5 @@ void do_raw_write_unlock(rwlock_t *lock) - debug_write_unlock(lock); - arch_write_unlock(&lock->raw_lock); - } -+ -+#endif --- -2.30.2 - diff --git a/debian/patches-rt/0179-locking-rtmutex-add-ww_mutex-addon-for-mutex-rt.patch b/debian/patches-rt/0179-locking-rtmutex-add-ww_mutex-addon-for-mutex-rt.patch deleted file mode 100644 index afe31f597..000000000 --- a/debian/patches-rt/0179-locking-rtmutex-add-ww_mutex-addon-for-mutex-rt.patch +++ /dev/null @@ -1,456 +0,0 @@ -From a11fafd5ea9661b8ed2054c505c2c8623baa15bf Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 12 Oct 2017 17:34:38 +0200 -Subject: [PATCH 179/296] locking/rtmutex: add ww_mutex addon for mutex-rt -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/mutex.h | 8 - - include/linux/ww_mutex.h | 8 + - kernel/locking/rtmutex.c | 262 ++++++++++++++++++++++++++++++-- - kernel/locking/rtmutex_common.h | 2 + - kernel/locking/rwsem-rt.c | 2 +- - 5 files changed, 262 insertions(+), 20 deletions(-) - -diff --git a/include/linux/mutex.h b/include/linux/mutex.h -index e45774a337d2..90923d3008fc 100644 ---- a/include/linux/mutex.h -+++ b/include/linux/mutex.h -@@ -82,14 +82,6 @@ struct mutex { - struct ww_class; - struct ww_acquire_ctx; - --struct ww_mutex { -- struct mutex base; -- struct ww_acquire_ctx *ctx; --#ifdef CONFIG_DEBUG_MUTEXES -- struct ww_class *ww_class; --#endif --}; -- - /* - * This is the control structure for tasks blocked on mutex, - * which resides on the blocked 
task's kernel stack: -diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h -index 6ecf2a0220db..3145de598645 100644 ---- a/include/linux/ww_mutex.h -+++ b/include/linux/ww_mutex.h -@@ -28,6 +28,14 @@ struct ww_class { - unsigned int is_wait_die; - }; - -+struct ww_mutex { -+ struct mutex base; -+ struct ww_acquire_ctx *ctx; -+#ifdef CONFIG_DEBUG_MUTEXES -+ struct ww_class *ww_class; -+#endif -+}; -+ - struct ww_acquire_ctx { - struct task_struct *task; - unsigned long stamp; -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index f983cb23edb5..50a6dd124746 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -24,6 +24,7 @@ - #include <linux/sched/wake_q.h> - #include <linux/sched/debug.h> - #include <linux/timer.h> -+#include <linux/ww_mutex.h> - - #include "rtmutex_common.h" - -@@ -1234,6 +1235,40 @@ EXPORT_SYMBOL(__rt_spin_lock_init); - - #endif /* PREEMPT_RT */ - -+#ifdef CONFIG_PREEMPT_RT -+ static inline int __sched -+__mutex_lock_check_stamp(struct rt_mutex *lock, struct ww_acquire_ctx *ctx) -+{ -+ struct ww_mutex *ww = container_of(lock, struct ww_mutex, base.lock); -+ struct ww_acquire_ctx *hold_ctx = READ_ONCE(ww->ctx); -+ -+ if (!hold_ctx) -+ return 0; -+ -+ if (unlikely(ctx == hold_ctx)) -+ return -EALREADY; -+ -+ if (ctx->stamp - hold_ctx->stamp <= LONG_MAX && -+ (ctx->stamp != hold_ctx->stamp || ctx > hold_ctx)) { -+#ifdef CONFIG_DEBUG_MUTEXES -+ DEBUG_LOCKS_WARN_ON(ctx->contending_lock); -+ ctx->contending_lock = ww; -+#endif -+ return -EDEADLK; -+ } -+ -+ return 0; -+} -+#else -+ static inline int __sched -+__mutex_lock_check_stamp(struct rt_mutex *lock, struct ww_acquire_ctx *ctx) -+{ -+ BUG(); -+ return 0; -+} -+ -+#endif -+ - static inline int - try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task, - struct rt_mutex_waiter *waiter) -@@ -1512,7 +1547,8 @@ void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter, bool savestate) - static int __sched - __rt_mutex_slowlock(struct 
rt_mutex *lock, int state, - struct hrtimer_sleeper *timeout, -- struct rt_mutex_waiter *waiter) -+ struct rt_mutex_waiter *waiter, -+ struct ww_acquire_ctx *ww_ctx) - { - int ret = 0; - -@@ -1530,6 +1566,12 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state, - break; - } - -+ if (ww_ctx && ww_ctx->acquired > 0) { -+ ret = __mutex_lock_check_stamp(lock, ww_ctx); -+ if (ret) -+ break; -+ } -+ - raw_spin_unlock_irq(&lock->wait_lock); - - schedule(); -@@ -1558,16 +1600,106 @@ static void rt_mutex_handle_deadlock(int res, int detect_deadlock, - } - } - -+static __always_inline void ww_mutex_lock_acquired(struct ww_mutex *ww, -+ struct ww_acquire_ctx *ww_ctx) -+{ -+#ifdef CONFIG_DEBUG_MUTEXES -+ /* -+ * If this WARN_ON triggers, you used ww_mutex_lock to acquire, -+ * but released with a normal mutex_unlock in this call. -+ * -+ * This should never happen, always use ww_mutex_unlock. -+ */ -+ DEBUG_LOCKS_WARN_ON(ww->ctx); -+ -+ /* -+ * Not quite done after calling ww_acquire_done() ? -+ */ -+ DEBUG_LOCKS_WARN_ON(ww_ctx->done_acquire); -+ -+ if (ww_ctx->contending_lock) { -+ /* -+ * After -EDEADLK you tried to -+ * acquire a different ww_mutex? Bad! -+ */ -+ DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock != ww); -+ -+ /* -+ * You called ww_mutex_lock after receiving -EDEADLK, -+ * but 'forgot' to unlock everything else first? -+ */ -+ DEBUG_LOCKS_WARN_ON(ww_ctx->acquired > 0); -+ ww_ctx->contending_lock = NULL; -+ } -+ -+ /* -+ * Naughty, using a different class will lead to undefined behavior! -+ */ -+ DEBUG_LOCKS_WARN_ON(ww_ctx->ww_class != ww->ww_class); -+#endif -+ ww_ctx->acquired++; -+} -+ -+#ifdef CONFIG_PREEMPT_RT -+static void ww_mutex_account_lock(struct rt_mutex *lock, -+ struct ww_acquire_ctx *ww_ctx) -+{ -+ struct ww_mutex *ww = container_of(lock, struct ww_mutex, base.lock); -+ struct rt_mutex_waiter *waiter, *n; -+ -+ /* -+ * This branch gets optimized out for the common case, -+ * and is only important for ww_mutex_lock. 
-+ */ -+ ww_mutex_lock_acquired(ww, ww_ctx); -+ ww->ctx = ww_ctx; -+ -+ /* -+ * Give any possible sleeping processes the chance to wake up, -+ * so they can recheck if they have to back off. -+ */ -+ rbtree_postorder_for_each_entry_safe(waiter, n, &lock->waiters.rb_root, -+ tree_entry) { -+ /* XXX debug rt mutex waiter wakeup */ -+ -+ BUG_ON(waiter->lock != lock); -+ rt_mutex_wake_waiter(waiter); -+ } -+} -+ -+#else -+ -+static void ww_mutex_account_lock(struct rt_mutex *lock, -+ struct ww_acquire_ctx *ww_ctx) -+{ -+ BUG(); -+} -+#endif -+ - int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state, - struct hrtimer_sleeper *timeout, - enum rtmutex_chainwalk chwalk, -+ struct ww_acquire_ctx *ww_ctx, - struct rt_mutex_waiter *waiter) - { - int ret; - -+#ifdef CONFIG_PREEMPT_RT -+ if (ww_ctx) { -+ struct ww_mutex *ww; -+ -+ ww = container_of(lock, struct ww_mutex, base.lock); -+ if (unlikely(ww_ctx == READ_ONCE(ww->ctx))) -+ return -EALREADY; -+ } -+#endif -+ - /* Try to acquire the lock again: */ -- if (try_to_take_rt_mutex(lock, current, NULL)) -+ if (try_to_take_rt_mutex(lock, current, NULL)) { -+ if (ww_ctx) -+ ww_mutex_account_lock(lock, ww_ctx); - return 0; -+ } - - set_current_state(state); - -@@ -1577,14 +1709,24 @@ int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state, - - ret = task_blocks_on_rt_mutex(lock, waiter, current, chwalk); - -- if (likely(!ret)) -+ if (likely(!ret)) { - /* sleep on the mutex */ -- ret = __rt_mutex_slowlock(lock, state, timeout, waiter); -+ ret = __rt_mutex_slowlock(lock, state, timeout, waiter, -+ ww_ctx); -+ } else if (ww_ctx) { -+ /* ww_mutex received EDEADLK, let it become EALREADY */ -+ ret = __mutex_lock_check_stamp(lock, ww_ctx); -+ BUG_ON(!ret); -+ } - - if (unlikely(ret)) { - __set_current_state(TASK_RUNNING); - remove_waiter(lock, waiter); -- rt_mutex_handle_deadlock(ret, chwalk, waiter); -+ /* ww_mutex wants to report EDEADLK/EALREADY, let it */ -+ if (!ww_ctx) -+ rt_mutex_handle_deadlock(ret, 
chwalk, waiter); -+ } else if (ww_ctx) { -+ ww_mutex_account_lock(lock, ww_ctx); - } - - /* -@@ -1601,7 +1743,8 @@ int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state, - static int __sched - rt_mutex_slowlock(struct rt_mutex *lock, int state, - struct hrtimer_sleeper *timeout, -- enum rtmutex_chainwalk chwalk) -+ enum rtmutex_chainwalk chwalk, -+ struct ww_acquire_ctx *ww_ctx) - { - struct rt_mutex_waiter waiter; - unsigned long flags; -@@ -1619,7 +1762,8 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state, - */ - raw_spin_lock_irqsave(&lock->wait_lock, flags); - -- ret = rt_mutex_slowlock_locked(lock, state, timeout, chwalk, &waiter); -+ ret = rt_mutex_slowlock_locked(lock, state, timeout, chwalk, ww_ctx, -+ &waiter); - - raw_spin_unlock_irqrestore(&lock->wait_lock, flags); - -@@ -1749,14 +1893,16 @@ static bool __sched rt_mutex_slowunlock(struct rt_mutex *lock, - */ - static inline int - rt_mutex_fastlock(struct rt_mutex *lock, int state, -+ struct ww_acquire_ctx *ww_ctx, - int (*slowfn)(struct rt_mutex *lock, int state, - struct hrtimer_sleeper *timeout, -- enum rtmutex_chainwalk chwalk)) -+ enum rtmutex_chainwalk chwalk, -+ struct ww_acquire_ctx *ww_ctx)) - { - if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current))) - return 0; - -- return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK); -+ return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK, ww_ctx); - } - - static inline int -@@ -1801,7 +1947,7 @@ rt_mutex_fastunlock(struct rt_mutex *lock, - int __sched __rt_mutex_lock_state(struct rt_mutex *lock, int state) - { - might_sleep(); -- return rt_mutex_fastlock(lock, state, rt_mutex_slowlock); -+ return rt_mutex_fastlock(lock, state, NULL, rt_mutex_slowlock); - } - - /** -@@ -2245,7 +2391,7 @@ int rt_mutex_wait_proxy_lock(struct rt_mutex *lock, - raw_spin_lock_irq(&lock->wait_lock); - /* sleep on the mutex */ - set_current_state(TASK_INTERRUPTIBLE); -- ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter); -+ ret = 
__rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL); - /* - * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might - * have to fix that up. -@@ -2315,3 +2461,97 @@ bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock, - - return cleanup; - } -+ -+static inline int -+ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) -+{ -+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH -+ unsigned int tmp; -+ -+ if (ctx->deadlock_inject_countdown-- == 0) { -+ tmp = ctx->deadlock_inject_interval; -+ if (tmp > UINT_MAX/4) -+ tmp = UINT_MAX; -+ else -+ tmp = tmp*2 + tmp + tmp/2; -+ -+ ctx->deadlock_inject_interval = tmp; -+ ctx->deadlock_inject_countdown = tmp; -+ ctx->contending_lock = lock; -+ -+ ww_mutex_unlock(lock); -+ -+ return -EDEADLK; -+ } -+#endif -+ -+ return 0; -+} -+ -+#ifdef CONFIG_PREEMPT_RT -+int __sched -+ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) -+{ -+ int ret; -+ -+ might_sleep(); -+ -+ mutex_acquire_nest(&lock->base.dep_map, 0, 0, -+ ctx ? &ctx->dep_map : NULL, _RET_IP_); -+ ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL, 0, -+ ctx); -+ if (ret) -+ mutex_release(&lock->base.dep_map, _RET_IP_); -+ else if (!ret && ctx && ctx->acquired > 1) -+ return ww_mutex_deadlock_injection(lock, ctx); -+ -+ return ret; -+} -+EXPORT_SYMBOL_GPL(ww_mutex_lock_interruptible); -+ -+int __sched -+ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) -+{ -+ int ret; -+ -+ might_sleep(); -+ -+ mutex_acquire_nest(&lock->base.dep_map, 0, 0, -+ ctx ? 
&ctx->dep_map : NULL, _RET_IP_); -+ ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL, 0, -+ ctx); -+ if (ret) -+ mutex_release(&lock->base.dep_map, _RET_IP_); -+ else if (!ret && ctx && ctx->acquired > 1) -+ return ww_mutex_deadlock_injection(lock, ctx); -+ -+ return ret; -+} -+EXPORT_SYMBOL_GPL(ww_mutex_lock); -+ -+void __sched ww_mutex_unlock(struct ww_mutex *lock) -+{ -+ /* -+ * The unlocking fastpath is the 0->1 transition from 'locked' -+ * into 'unlocked' state: -+ */ -+ if (lock->ctx) { -+#ifdef CONFIG_DEBUG_MUTEXES -+ DEBUG_LOCKS_WARN_ON(!lock->ctx->acquired); -+#endif -+ if (lock->ctx->acquired > 0) -+ lock->ctx->acquired--; -+ lock->ctx = NULL; -+ } -+ -+ mutex_release(&lock->base.dep_map, _RET_IP_); -+ __rt_mutex_unlock(&lock->base.lock); -+} -+EXPORT_SYMBOL(ww_mutex_unlock); -+ -+int __rt_mutex_owner_current(struct rt_mutex *lock) -+{ -+ return rt_mutex_owner(lock) == current; -+} -+EXPORT_SYMBOL(__rt_mutex_owner_current); -+#endif -diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h -index c1a280167e3c..248a7d91583b 100644 ---- a/kernel/locking/rtmutex_common.h -+++ b/kernel/locking/rtmutex_common.h -@@ -159,6 +159,7 @@ extern void rt_mutex_postunlock(struct wake_q_head *wake_q, - struct wake_q_head *wake_sleeper_q); - - /* RW semaphore special interface */ -+struct ww_acquire_ctx; - - extern int __rt_mutex_lock_state(struct rt_mutex *lock, int state); - extern int __rt_mutex_trylock(struct rt_mutex *lock); -@@ -166,6 +167,7 @@ extern void __rt_mutex_unlock(struct rt_mutex *lock); - int __sched rt_mutex_slowlock_locked(struct rt_mutex *lock, int state, - struct hrtimer_sleeper *timeout, - enum rtmutex_chainwalk chwalk, -+ struct ww_acquire_ctx *ww_ctx, - struct rt_mutex_waiter *waiter); - void __sched rt_spin_lock_slowlock_locked(struct rt_mutex *lock, - struct rt_mutex_waiter *waiter, -diff --git a/kernel/locking/rwsem-rt.c b/kernel/locking/rwsem-rt.c -index a0771c150041..274172d5bb3a 100644 ---- 
a/kernel/locking/rwsem-rt.c -+++ b/kernel/locking/rwsem-rt.c -@@ -138,7 +138,7 @@ static int __sched __down_read_common(struct rw_semaphore *sem, int state) - */ - rt_mutex_init_waiter(&waiter, false); - ret = rt_mutex_slowlock_locked(m, state, NULL, RT_MUTEX_MIN_CHAINWALK, -- &waiter); -+ NULL, &waiter); - /* - * The slowlock() above is guaranteed to return with the rtmutex (for - * ret = 0) is now held, so there can't be a writer active. Increment --- -2.30.2 - diff --git a/debian/patches-rt/0180-locking-rtmutex-Use-custom-scheduling-function-for-s.patch b/debian/patches-rt/0180-locking-rtmutex-Use-custom-scheduling-function-for-s.patch deleted file mode 100644 index d9764e52c..000000000 --- a/debian/patches-rt/0180-locking-rtmutex-Use-custom-scheduling-function-for-s.patch +++ /dev/null @@ -1,243 +0,0 @@ -From 908eace8f12414fd55e84e4281eb7dccc73977b9 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 6 Oct 2020 13:07:17 +0200 -Subject: [PATCH 180/296] locking/rtmutex: Use custom scheduling function for - spin-schedule() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -PREEMPT_RT builds the rwsem, mutex, spinlock and rwlock typed locks on -top of an rtmutex lock. While blocked, task->pi_blocked_on is set -(tsk_is_pi_blocked()) and the task needs to schedule away while waiting. - -The scheduling process must distinguish between blocking on a regular -sleeping lock (rwsem and mutex) and an RT-only sleeping lock (spinlock -and rwlock): -- rwsem and mutex must flush block requests (blk_schedule_flush_plug()) - even if blocked on a lock. This cannot deadlock because this also - happens for non-RT. - There should be a warning if the scheduling point is within an RCU read - section. - -- spinlock and rwlock must not flush block requests. This will deadlock - if the callback attempts to acquire a lock which is already acquired.
- Similarly to being preempted, there should be no warning if the - scheduling point is within an RCU read section. - -Add preempt_schedule_lock() which is invoked if scheduling is required -while blocking on a PREEMPT_RT-only sleeping lock. -Remove tsk_is_pi_blocked() from the scheduler path which is no longer -needed with the additional scheduler entry point. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arm64/include/asm/preempt.h | 3 +++ - arch/x86/include/asm/preempt.h | 3 +++ - include/asm-generic/preempt.h | 3 +++ - include/linux/sched/rt.h | 8 -------- - kernel/locking/rtmutex.c | 2 +- - kernel/locking/rwlock-rt.c | 2 +- - kernel/sched/core.c | 32 +++++++++++++++++++++----------- - 7 files changed, 32 insertions(+), 21 deletions(-) - -diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h -index 80e946b2abee..f06a23898540 100644 ---- a/arch/arm64/include/asm/preempt.h -+++ b/arch/arm64/include/asm/preempt.h -@@ -81,6 +81,9 @@ static inline bool should_resched(int preempt_offset) - - #ifdef CONFIG_PREEMPTION - void preempt_schedule(void); -+#ifdef CONFIG_PREEMPT_RT -+void preempt_schedule_lock(void); -+#endif - #define __preempt_schedule() preempt_schedule() - void preempt_schedule_notrace(void); - #define __preempt_schedule_notrace() preempt_schedule_notrace() -diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h -index 69485ca13665..5ef93c81b274 100644 ---- a/arch/x86/include/asm/preempt.h -+++ b/arch/x86/include/asm/preempt.h -@@ -103,6 +103,9 @@ static __always_inline bool should_resched(int preempt_offset) - } - - #ifdef CONFIG_PREEMPTION -+#ifdef CONFIG_PREEMPT_RT -+ extern void preempt_schedule_lock(void); -+#endif - extern asmlinkage void preempt_schedule_thunk(void); - # define __preempt_schedule() \ - asm volatile ("call preempt_schedule_thunk" : ASM_CALL_CONSTRAINT) -diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h -index
d683f5e6d791..71c1535db56a 100644 ---- a/include/asm-generic/preempt.h -+++ b/include/asm-generic/preempt.h -@@ -79,6 +79,9 @@ static __always_inline bool should_resched(int preempt_offset) - } - - #ifdef CONFIG_PREEMPTION -+#ifdef CONFIG_PREEMPT_RT -+extern void preempt_schedule_lock(void); -+#endif - extern asmlinkage void preempt_schedule(void); - #define __preempt_schedule() preempt_schedule() - extern asmlinkage void preempt_schedule_notrace(void); -diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h -index e5af028c08b4..994c25640e15 100644 ---- a/include/linux/sched/rt.h -+++ b/include/linux/sched/rt.h -@@ -39,20 +39,12 @@ static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *p) - } - extern void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task); - extern void rt_mutex_adjust_pi(struct task_struct *p); --static inline bool tsk_is_pi_blocked(struct task_struct *tsk) --{ -- return tsk->pi_blocked_on != NULL; --} - #else - static inline struct task_struct *rt_mutex_get_top_task(struct task_struct *task) - { - return NULL; - } - # define rt_mutex_adjust_pi(p) do { } while (0) --static inline bool tsk_is_pi_blocked(struct task_struct *tsk) --{ -- return false; --} - #endif - - extern void normalize_rt_tasks(void); -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index 50a6dd124746..4e8f73d9319d 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -1067,7 +1067,7 @@ void __sched rt_spin_lock_slowlock_locked(struct rt_mutex *lock, - raw_spin_unlock_irqrestore(&lock->wait_lock, flags); - - if (top_waiter != waiter || adaptive_wait(lock, lock_owner)) -- schedule(); -+ preempt_schedule_lock(); - - raw_spin_lock_irqsave(&lock->wait_lock, flags); - -diff --git a/kernel/locking/rwlock-rt.c b/kernel/locking/rwlock-rt.c -index 1ee16b8fedd7..16be7111aae7 100644 ---- a/kernel/locking/rwlock-rt.c -+++ b/kernel/locking/rwlock-rt.c -@@ -211,7 +211,7 @@ static void __write_rt_lock(struct 
rt_rw_lock *lock) - raw_spin_unlock_irqrestore(&m->wait_lock, flags); - - if (atomic_read(&lock->readers) != 0) -- schedule(); -+ preempt_schedule_lock(); - - raw_spin_lock_irqsave(&m->wait_lock, flags); - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 4f197e699690..a7005a5dea68 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -4978,7 +4978,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) - * - * WARNING: must be called with preemption disabled! - */ --static void __sched notrace __schedule(bool preempt) -+static void __sched notrace __schedule(bool preempt, bool spinning_lock) - { - struct task_struct *prev, *next; - unsigned long *switch_count; -@@ -5031,7 +5031,7 @@ static void __sched notrace __schedule(bool preempt) - * - ptrace_{,un}freeze_traced() can change ->state underneath us. - */ - prev_state = prev->state; -- if (!preempt && prev_state) { -+ if ((!preempt || spinning_lock) && prev_state) { - if (signal_pending_state(prev_state, prev)) { - prev->state = TASK_RUNNING; - } else { -@@ -5115,7 +5115,7 @@ void __noreturn do_task_dead(void) - /* Tell freezer to ignore us: */ - current->flags |= PF_NOFREEZE; - -- __schedule(false); -+ __schedule(false, false); - BUG(); - - /* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */ -@@ -5148,9 +5148,6 @@ static inline void sched_submit_work(struct task_struct *tsk) - preempt_enable_no_resched(); - } - -- if (tsk_is_pi_blocked(tsk)) -- return; -- - /* - * If we are going to sleep and we have plugged IO queued, - * make sure to submit it to avoid deadlocks. 
-@@ -5176,7 +5173,7 @@ asmlinkage __visible void __sched schedule(void) - sched_submit_work(tsk); - do { - preempt_disable(); -- __schedule(false); -+ __schedule(false, false); - sched_preempt_enable_no_resched(); - } while (need_resched()); - sched_update_worker(tsk); -@@ -5204,7 +5201,7 @@ void __sched schedule_idle(void) - */ - WARN_ON_ONCE(current->state); - do { -- __schedule(false); -+ __schedule(false, false); - } while (need_resched()); - } - -@@ -5257,7 +5254,7 @@ static void __sched notrace preempt_schedule_common(void) - */ - preempt_disable_notrace(); - preempt_latency_start(1); -- __schedule(true); -+ __schedule(true, false); - preempt_latency_stop(1); - preempt_enable_no_resched_notrace(); - -@@ -5287,6 +5284,19 @@ asmlinkage __visible void __sched notrace preempt_schedule(void) - NOKPROBE_SYMBOL(preempt_schedule); - EXPORT_SYMBOL(preempt_schedule); - -+#ifdef CONFIG_PREEMPT_RT -+void __sched notrace preempt_schedule_lock(void) -+{ -+ do { -+ preempt_disable(); -+ __schedule(true, true); -+ sched_preempt_enable_no_resched(); -+ } while (need_resched()); -+} -+NOKPROBE_SYMBOL(preempt_schedule_lock); -+EXPORT_SYMBOL(preempt_schedule_lock); -+#endif -+ - /** - * preempt_schedule_notrace - preempt_schedule called by tracing - * -@@ -5330,7 +5340,7 @@ asmlinkage __visible void __sched notrace preempt_schedule_notrace(void) - * an infinite recursion. 
- */ - prev_ctx = exception_enter(); -- __schedule(true); -+ __schedule(true, false); - exception_exit(prev_ctx); - - preempt_latency_stop(1); -@@ -5359,7 +5369,7 @@ asmlinkage __visible void __sched preempt_schedule_irq(void) - do { - preempt_disable(); - local_irq_enable(); -- __schedule(true); -+ __schedule(true, false); - local_irq_disable(); - sched_preempt_enable_no_resched(); - } while (need_resched()); --- -2.30.2 - diff --git a/debian/patches-rt/0182-preempt-Provide-preempt_-_-no-rt-variants.patch b/debian/patches-rt/0182-preempt-Provide-preempt_-_-no-rt-variants.patch deleted file mode 100644 index ab1040ea4..000000000 --- a/debian/patches-rt/0182-preempt-Provide-preempt_-_-no-rt-variants.patch +++ /dev/null @@ -1,53 +0,0 @@ -From eeb1c212d82ca200f85df5e71e2ae766be7c8464 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 24 Jul 2009 12:38:56 +0200 -Subject: [PATCH 182/296] preempt: Provide preempt_*_(no)rt variants -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -RT needs a few preempt_disable/enable points which are not necessary -otherwise. Implement variants to avoid #ifdeffery. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/preempt.h | 18 +++++++++++++++++- - 1 file changed, 17 insertions(+), 1 deletion(-) - -diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index 4d244e295e85..5ceac863e729 100644 ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -188,7 +188,11 @@ do { \ - preempt_count_dec(); \ - } while (0) - --#define preempt_enable_no_resched() sched_preempt_enable_no_resched() -+#ifdef CONFIG_PREEMPT_RT -+# define preempt_enable_no_resched() sched_preempt_enable_no_resched() -+#else -+# define preempt_enable_no_resched() preempt_enable() -+#endif - - #define preemptible() (preempt_count() == 0 && !irqs_disabled()) - -@@ -282,6 +286,18 @@ do { \ - set_preempt_need_resched(); \ - } while (0) - -+#ifdef CONFIG_PREEMPT_RT -+# define preempt_disable_rt() preempt_disable() -+# define preempt_enable_rt() preempt_enable() -+# define preempt_disable_nort() barrier() -+# define preempt_enable_nort() barrier() -+#else -+# define preempt_disable_rt() barrier() -+# define preempt_enable_rt() barrier() -+# define preempt_disable_nort() preempt_disable() -+# define preempt_enable_nort() preempt_enable() -+#endif -+ - #ifdef CONFIG_PREEMPT_NOTIFIERS - - struct preempt_notifier; --- -2.30.2 - diff --git a/debian/patches-rt/0183-mm-vmstat-Protect-per-cpu-variables-with-preempt-dis.patch b/debian/patches-rt/0183-mm-vmstat-Protect-per-cpu-variables-with-preempt-dis.patch deleted file mode 100644 index 45c0b4a18..000000000 --- a/debian/patches-rt/0183-mm-vmstat-Protect-per-cpu-variables-with-preempt-dis.patch +++ /dev/null @@ -1,145 +0,0 @@ -From fed9ec55a9c8af6f19df5927def3625955a40b06 Mon Sep 17 00:00:00 2001 -From: Ingo Molnar <mingo@elte.hu> -Date: Fri, 3 Jul 2009 08:30:13 -0500 -Subject: [PATCH 183/296] mm/vmstat: Protect per cpu variables with preempt - disable on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Disable preemption on 
-RT for the vmstat code. On vanilla the code runs in -IRQ-off regions while on -RT it is not. "preempt_disable" ensures that the -same resource is not updated in parallel due to preemption. - -Signed-off-by: Ingo Molnar <mingo@elte.hu> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/vmstat.h | 4 ++++ - mm/vmstat.c | 12 ++++++++++++ - 2 files changed, 16 insertions(+) - -diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h -index 322dcbfcc933..9a3a10ea3e3c 100644 ---- a/include/linux/vmstat.h -+++ b/include/linux/vmstat.h -@@ -63,7 +63,9 @@ DECLARE_PER_CPU(struct vm_event_state, vm_event_states); - */ - static inline void __count_vm_event(enum vm_event_item item) - { -+ preempt_disable_rt(); - raw_cpu_inc(vm_event_states.event[item]); -+ preempt_enable_rt(); - } - - static inline void count_vm_event(enum vm_event_item item) -@@ -73,7 +75,9 @@ static inline void count_vm_event(enum vm_event_item item) - - static inline void __count_vm_events(enum vm_event_item item, long delta) - { -+ preempt_disable_rt(); - raw_cpu_add(vm_event_states.event[item], delta); -+ preempt_enable_rt(); - } - - static inline void count_vm_events(enum vm_event_item item, long delta) -diff --git a/mm/vmstat.c b/mm/vmstat.c -index 698bc0bc18d1..1e7688a43887 100644 ---- a/mm/vmstat.c -+++ b/mm/vmstat.c -@@ -321,6 +321,7 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item, - long x; - long t; - -+ preempt_disable_rt(); - x = delta + __this_cpu_read(*p); - - t = __this_cpu_read(pcp->stat_threshold); -@@ -330,6 +331,7 @@ void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item, - x = 0; - } - __this_cpu_write(*p, x); -+ preempt_enable_rt(); - } - EXPORT_SYMBOL(__mod_zone_page_state); - -@@ -346,6 +348,7 @@ void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item, - delta >>= PAGE_SHIFT; - } - -+ preempt_disable_rt(); - x = delta + __this_cpu_read(*p); - - t = __this_cpu_read(pcp->stat_threshold); -@@
-355,6 +358,7 @@ void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item item, - x = 0; - } - __this_cpu_write(*p, x); -+ preempt_enable_rt(); - } - EXPORT_SYMBOL(__mod_node_page_state); - -@@ -387,6 +391,7 @@ void __inc_zone_state(struct zone *zone, enum zone_stat_item item) - s8 __percpu *p = pcp->vm_stat_diff + item; - s8 v, t; - -+ preempt_disable_rt(); - v = __this_cpu_inc_return(*p); - t = __this_cpu_read(pcp->stat_threshold); - if (unlikely(v > t)) { -@@ -395,6 +400,7 @@ void __inc_zone_state(struct zone *zone, enum zone_stat_item item) - zone_page_state_add(v + overstep, zone, item); - __this_cpu_write(*p, -overstep); - } -+ preempt_enable_rt(); - } - - void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item) -@@ -405,6 +411,7 @@ void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item) - - VM_WARN_ON_ONCE(vmstat_item_in_bytes(item)); - -+ preempt_disable_rt(); - v = __this_cpu_inc_return(*p); - t = __this_cpu_read(pcp->stat_threshold); - if (unlikely(v > t)) { -@@ -413,6 +420,7 @@ void __inc_node_state(struct pglist_data *pgdat, enum node_stat_item item) - node_page_state_add(v + overstep, pgdat, item); - __this_cpu_write(*p, -overstep); - } -+ preempt_enable_rt(); - } - - void __inc_zone_page_state(struct page *page, enum zone_stat_item item) -@@ -433,6 +441,7 @@ void __dec_zone_state(struct zone *zone, enum zone_stat_item item) - s8 __percpu *p = pcp->vm_stat_diff + item; - s8 v, t; - -+ preempt_disable_rt(); - v = __this_cpu_dec_return(*p); - t = __this_cpu_read(pcp->stat_threshold); - if (unlikely(v < - t)) { -@@ -441,6 +450,7 @@ void __dec_zone_state(struct zone *zone, enum zone_stat_item item) - zone_page_state_add(v - overstep, zone, item); - __this_cpu_write(*p, overstep); - } -+ preempt_enable_rt(); - } - - void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item) -@@ -451,6 +461,7 @@ void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item) - - 
VM_WARN_ON_ONCE(vmstat_item_in_bytes(item)); - -+ preempt_disable_rt(); - v = __this_cpu_dec_return(*p); - t = __this_cpu_read(pcp->stat_threshold); - if (unlikely(v < - t)) { -@@ -459,6 +470,7 @@ void __dec_node_state(struct pglist_data *pgdat, enum node_stat_item item) - node_page_state_add(v - overstep, pgdat, item); - __this_cpu_write(*p, overstep); - } -+ preempt_enable_rt(); - } - - void __dec_zone_page_state(struct page *page, enum zone_stat_item item) --- -2.30.2 - diff --git a/debian/patches-rt/0184-mm-memcontrol-Disable-preemption-in-__mod_memcg_lruv.patch b/debian/patches-rt/0184-mm-memcontrol-Disable-preemption-in-__mod_memcg_lruv.patch deleted file mode 100644 index 06badc00d..000000000 --- a/debian/patches-rt/0184-mm-memcontrol-Disable-preemption-in-__mod_memcg_lruv.patch +++ /dev/null @@ -1,44 +0,0 @@ -From 8a90f4a5c97101242b45ec3b689761d336e1346a Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 28 Oct 2020 18:15:32 +0100 -Subject: [PATCH 184/296] mm/memcontrol: Disable preemption in - __mod_memcg_lruvec_state() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The callers expect disabled preemption/interrupts while invoking -__mod_memcg_lruvec_state(). This works in mainline because a lock of -some kind is acquired. - -Use preempt_disable_rt() where per-CPU variables are accessed and a -stable pointer is expected. This is also done in __mod_zone_page_state() -for the same reason.
- -Cc: stable-rt@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/memcontrol.c | 2 ++ - 1 file changed, 2 insertions(+) - -diff --git a/mm/memcontrol.c b/mm/memcontrol.c -index d72d2b90474a..bc99242d3415 100644 ---- a/mm/memcontrol.c -+++ b/mm/memcontrol.c -@@ -816,6 +816,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx, - pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec); - memcg = pn->memcg; - -+ preempt_disable_rt(); - /* Update memcg */ - __mod_memcg_state(memcg, idx, val); - -@@ -835,6 +836,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx, - x = 0; - } - __this_cpu_write(pn->lruvec_stat_cpu->count[idx], x); -+ preempt_enable_rt(); - } - - /** --- -2.30.2 - diff --git a/debian/patches-rt/0185-xfrm-Use-sequence-counter-with-associated-spinlock.patch b/debian/patches-rt/0185-xfrm-Use-sequence-counter-with-associated-spinlock.patch deleted file mode 100644 index d9f63af32..000000000 --- a/debian/patches-rt/0185-xfrm-Use-sequence-counter-with-associated-spinlock.patch +++ /dev/null @@ -1,46 +0,0 @@ -From f395bddba04525619e9305a58c86554f8e8d1325 Mon Sep 17 00:00:00 2001 -From: "Ahmed S. Darwish" <a.darwish@linutronix.de> -Date: Wed, 10 Jun 2020 12:53:22 +0200 -Subject: [PATCH 185/296] xfrm: Use sequence counter with associated spinlock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -A sequence counter write side critical section must be protected by some -form of locking to serialize writers. A plain seqcount_t does not -contain the information of which lock must be held when entering a write -side critical section. - -Use the new seqcount_spinlock_t data type, which allows a -spinlock to be associated with the sequence counter. This enables lockdep to verify that -the spinlock used for writer serialization is held when the write side -critical section is entered.
- -If lockdep is disabled this lock association is compiled out and has -neither storage size nor runtime overhead. - -Upstream-status: The xfrm locking used for seqcount writer serialization -appears to be broken. If that's the case, a proper fix will need to be -submitted upstream. (e.g. make the seqcount per network namespace?) - -Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - net/xfrm/xfrm_state.c | 3 ++- - 1 file changed, 2 insertions(+), 1 deletion(-) - -diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c -index 77499abd9f99..7a2840d53654 100644 ---- a/net/xfrm/xfrm_state.c -+++ b/net/xfrm/xfrm_state.c -@@ -2663,7 +2663,8 @@ int __net_init xfrm_state_init(struct net *net) - net->xfrm.state_num = 0; - INIT_WORK(&net->xfrm.state_hash_work, xfrm_hash_resize); - spin_lock_init(&net->xfrm.xfrm_state_lock); -- seqcount_init(&net->xfrm.xfrm_state_hash_generation); -+ seqcount_spinlock_init(&net->xfrm.xfrm_state_hash_generation, -+ &net->xfrm.xfrm_state_lock); - return 0; - - out_byspi: --- -2.30.2 - diff --git a/debian/patches-rt/0188-fs-dcache-disable-preemption-on-i_dir_seq-s-write-si.patch b/debian/patches-rt/0188-fs-dcache-disable-preemption-on-i_dir_seq-s-write-si.patch deleted file mode 100644 index f3b73cb60..000000000 --- a/debian/patches-rt/0188-fs-dcache-disable-preemption-on-i_dir_seq-s-write-si.patch +++ /dev/null @@ -1,99 +0,0 @@ -From 5507b749a51dbc791ae0211a0d0393533ee12427 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 20 Oct 2017 11:29:53 +0200 -Subject: [PATCH 188/296] fs/dcache: disable preemption on i_dir_seq's write - side -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -i_dir_seq is an open-coded seqcount. Based on the code it looks like we -could have two writers in parallel despite the fact that the d_lock is -held.
The problem is that during the write process on RT the preemption -is still enabled and if this process is interrupted by a reader with RT -priority then we lock up. -To avoid that lockup I am disabling the preemption during the update. -The rename of i_dir_seq is here to ensure we catch new write sides in the -future. - -Cc: stable-rt@vger.kernel.org -Reported-by: Oleg.Karfich@wago.com -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - fs/dcache.c | 12 +++++++----- - fs/inode.c | 2 +- - include/linux/fs.h | 2 +- - 3 files changed, 9 insertions(+), 7 deletions(-) - -diff --git a/fs/dcache.c b/fs/dcache.c -index 1f4255ef8722..26a187abf13a 100644 ---- a/fs/dcache.c -+++ b/fs/dcache.c -@@ -2503,9 +2503,10 @@ EXPORT_SYMBOL(d_rehash); - static inline unsigned start_dir_add(struct inode *dir) - { - -+ preempt_disable_rt(); - for (;;) { -- unsigned n = dir->i_dir_seq; -- if (!(n & 1) && cmpxchg(&dir->i_dir_seq, n, n + 1) == n) -+ unsigned n = dir->__i_dir_seq; -+ if (!(n & 1) && cmpxchg(&dir->__i_dir_seq, n, n + 1) == n) - return n; - cpu_relax(); - } -@@ -2513,7 +2514,8 @@ static inline unsigned start_dir_add(struct inode *dir) - - static inline void end_dir_add(struct inode *dir, unsigned n) - { -- smp_store_release(&dir->i_dir_seq, n + 2); -+ smp_store_release(&dir->__i_dir_seq, n + 2); -+ preempt_enable_rt(); - } - - static void d_wait_lookup(struct dentry *dentry) -@@ -2549,7 +2551,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, - - retry: - rcu_read_lock(); -- seq = smp_load_acquire(&parent->d_inode->i_dir_seq); -+ seq = smp_load_acquire(&parent->d_inode->__i_dir_seq); - r_seq = read_seqbegin(&rename_lock); - dentry = __d_lookup_rcu(parent, name, &d_seq); - if (unlikely(dentry)) { -@@ -2577,7 +2579,7 @@ struct dentry *d_alloc_parallel(struct dentry *parent, - } - - hlist_bl_lock(b); -- if (unlikely(READ_ONCE(parent->d_inode->i_dir_seq) != seq)) { -+ if (unlikely(READ_ONCE(parent->d_inode->__i_dir_seq) != seq)) { - hlist_bl_unlock(b); -
rcu_read_unlock(); - goto retry; -diff --git a/fs/inode.c b/fs/inode.c -index 5eea9912a0b9..85da949d9083 100644 ---- a/fs/inode.c -+++ b/fs/inode.c -@@ -158,7 +158,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) - inode->i_bdev = NULL; - inode->i_cdev = NULL; - inode->i_link = NULL; -- inode->i_dir_seq = 0; -+ inode->__i_dir_seq = 0; - inode->i_rdev = 0; - inode->dirtied_when = 0; - -diff --git a/include/linux/fs.h b/include/linux/fs.h -index 8bde32cf9711..61ad6d7bfc95 100644 ---- a/include/linux/fs.h -+++ b/include/linux/fs.h -@@ -699,7 +699,7 @@ struct inode { - struct block_device *i_bdev; - struct cdev *i_cdev; - char *i_link; -- unsigned i_dir_seq; -+ unsigned __i_dir_seq; - }; - - __u32 i_generation; --- -2.30.2 - diff --git a/debian/patches-rt/0189-net-Qdisc-use-a-seqlock-instead-seqcount.patch b/debian/patches-rt/0189-net-Qdisc-use-a-seqlock-instead-seqcount.patch deleted file mode 100644 index 2015cb161..000000000 --- a/debian/patches-rt/0189-net-Qdisc-use-a-seqlock-instead-seqcount.patch +++ /dev/null @@ -1,281 +0,0 @@ -From a3d4fb66e368e931b012a8971cde9c7cfdcc3993 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 14 Sep 2016 17:36:35 +0200 -Subject: [PATCH 189/296] net/Qdisc: use a seqlock instead seqcount -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The seqcount disables preemption on -RT while it is held, which we can't -remove. Also we don't want the reader to spin for ages if the writer is -scheduled out. The seqlock on the other hand will serialize / sleep on -the lock while the writer is active.
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/net/gen_stats.h | 11 ++++++----- - include/net/net_seq_lock.h | 24 ++++++++++++++++++++++++ - include/net/sch_generic.h | 19 +++++++++++++++++-- - net/core/gen_estimator.c | 6 +++--- - net/core/gen_stats.c | 12 ++++++------ - net/sched/sch_api.c | 2 +- - net/sched/sch_generic.c | 10 ++++++++++ - 7 files changed, 67 insertions(+), 17 deletions(-) - create mode 100644 include/net/net_seq_lock.h - ---- a/include/net/gen_stats.h -+++ b/include/net/gen_stats.h -@@ -6,6 +6,7 @@ - #include <linux/socket.h> - #include <linux/rtnetlink.h> - #include <linux/pkt_sched.h> -+#include <net/net_seq_lock.h> - - /* Note: this used to be in include/uapi/linux/gen_stats.h */ - struct gnet_stats_basic_packed { -@@ -42,15 +43,15 @@ - spinlock_t *lock, struct gnet_dump *d, - int padattr); - --int gnet_stats_copy_basic(const seqcount_t *running, -+int gnet_stats_copy_basic(net_seqlock_t *running, - struct gnet_dump *d, - struct gnet_stats_basic_cpu __percpu *cpu, - struct gnet_stats_basic_packed *b); --void __gnet_stats_copy_basic(const seqcount_t *running, -+void __gnet_stats_copy_basic(net_seqlock_t *running, - struct gnet_stats_basic_packed *bstats, - struct gnet_stats_basic_cpu __percpu *cpu, - struct gnet_stats_basic_packed *b); --int gnet_stats_copy_basic_hw(const seqcount_t *running, -+int gnet_stats_copy_basic_hw(net_seqlock_t *running, - struct gnet_dump *d, - struct gnet_stats_basic_cpu __percpu *cpu, - struct gnet_stats_basic_packed *b); -@@ -70,13 +71,13 @@ - struct gnet_stats_basic_cpu __percpu *cpu_bstats, - struct net_rate_estimator __rcu **rate_est, - spinlock_t *lock, -- seqcount_t *running, struct nlattr *opt); -+ net_seqlock_t *running, struct nlattr *opt); - void gen_kill_estimator(struct net_rate_estimator __rcu **ptr); - int gen_replace_estimator(struct gnet_stats_basic_packed *bstats, - struct gnet_stats_basic_cpu __percpu *cpu_bstats, - struct net_rate_estimator __rcu **ptr, - 
spinlock_t *lock, -- seqcount_t *running, struct nlattr *opt); -+ net_seqlock_t *running, struct nlattr *opt); - bool gen_estimator_active(struct net_rate_estimator __rcu **ptr); - bool gen_estimator_read(struct net_rate_estimator __rcu **ptr, - struct gnet_stats_rate_est64 *sample); ---- /dev/null -+++ b/include/net/net_seq_lock.h -@@ -0,0 +1,24 @@ -+#ifndef __NET_NET_SEQ_LOCK_H__ -+#define __NET_NET_SEQ_LOCK_H__ -+ -+#ifdef CONFIG_PREEMPT_RT -+# define net_seqlock_t seqlock_t -+# define net_seq_begin(__r) read_seqbegin(__r) -+# define net_seq_retry(__r, __s) read_seqretry(__r, __s) -+ -+static inline int try_write_seqlock(seqlock_t *sl) -+{ -+ if (spin_trylock(&sl->lock)) { -+ write_seqcount_begin(&sl->seqcount); -+ return 1; -+ } -+ return 0; -+} -+ -+#else -+# define net_seqlock_t seqcount_t -+# define net_seq_begin(__r) read_seqcount_begin(__r) -+# define net_seq_retry(__r, __s) read_seqcount_retry(__r, __s) -+#endif -+ -+#endif ---- a/include/net/sch_generic.h -+++ b/include/net/sch_generic.h -@@ -10,6 +10,7 @@ - #include <linux/percpu.h> - #include <linux/dynamic_queue_limits.h> - #include <linux/list.h> -+#include <net/net_seq_lock.h> - #include <linux/refcount.h> - #include <linux/workqueue.h> - #include <linux/mutex.h> -@@ -101,7 +102,7 @@ - struct sk_buff_head gso_skb ____cacheline_aligned_in_smp; - struct qdisc_skb_head q; - struct gnet_stats_basic_packed bstats; -- seqcount_t running; -+ net_seqlock_t running; - struct gnet_stats_queue qstats; - unsigned long state; - struct Qdisc *next_sched; -@@ -142,7 +143,11 @@ - { - if (qdisc->flags & TCQ_F_NOLOCK) - return spin_is_locked(&qdisc->seqlock); -+#ifdef CONFIG_PREEMPT_RT -+ return spin_is_locked(&qdisc->running.lock) ? true : false; -+#else - return (raw_read_seqcount(&qdisc->running) & 1) ? 
true : false; -+#endif - } - - static inline bool qdisc_is_percpu_stats(const struct Qdisc *q) -@@ -191,17 +196,27 @@ - } else if (qdisc_is_running(qdisc)) { - return false; - } -+#ifdef CONFIG_PREEMPT_RT -+ if (try_write_seqlock(&qdisc->running)) -+ return true; -+ return false; -+#else - /* Variant of write_seqcount_begin() telling lockdep a trylock - * was attempted. - */ - raw_write_seqcount_begin(&qdisc->running); - seqcount_acquire(&qdisc->running.dep_map, 0, 1, _RET_IP_); - return true; -+#endif - } - - static inline void qdisc_run_end(struct Qdisc *qdisc) - { -+#ifdef CONFIG_PREEMPT_RT -+ write_sequnlock(&qdisc->running); -+#else - write_seqcount_end(&qdisc->running); -+#endif - if (qdisc->flags & TCQ_F_NOLOCK) { - spin_unlock(&qdisc->seqlock); - -@@ -583,7 +598,7 @@ - return qdisc_lock(root); - } - --static inline seqcount_t *qdisc_root_sleeping_running(const struct Qdisc *qdisc) -+static inline net_seqlock_t *qdisc_root_sleeping_running(const struct Qdisc *qdisc) - { - struct Qdisc *root = qdisc_root_sleeping(qdisc); - ---- a/net/core/gen_estimator.c -+++ b/net/core/gen_estimator.c -@@ -42,7 +42,7 @@ - struct net_rate_estimator { - struct gnet_stats_basic_packed *bstats; - spinlock_t *stats_lock; -- seqcount_t *running; -+ net_seqlock_t *running; - struct gnet_stats_basic_cpu __percpu *cpu_bstats; - u8 ewma_log; - u8 intvl_log; /* period : (250ms << intvl_log) */ -@@ -125,7 +125,7 @@ - struct gnet_stats_basic_cpu __percpu *cpu_bstats, - struct net_rate_estimator __rcu **rate_est, - spinlock_t *lock, -- seqcount_t *running, -+ net_seqlock_t *running, - struct nlattr *opt) - { - struct gnet_estimator *parm = nla_data(opt); -@@ -226,7 +226,7 @@ - struct gnet_stats_basic_cpu __percpu *cpu_bstats, - struct net_rate_estimator __rcu **rate_est, - spinlock_t *lock, -- seqcount_t *running, struct nlattr *opt) -+ net_seqlock_t *running, struct nlattr *opt) - { - return gen_new_estimator(bstats, cpu_bstats, rate_est, - lock, running, opt); ---- 
a/net/core/gen_stats.c -+++ b/net/core/gen_stats.c -@@ -137,7 +137,7 @@ - } - - void --__gnet_stats_copy_basic(const seqcount_t *running, -+__gnet_stats_copy_basic(net_seqlock_t *running, - struct gnet_stats_basic_packed *bstats, - struct gnet_stats_basic_cpu __percpu *cpu, - struct gnet_stats_basic_packed *b) -@@ -150,15 +150,15 @@ - } - do { - if (running) -- seq = read_seqcount_begin(running); -+ seq = net_seq_begin(running); - bstats->bytes = b->bytes; - bstats->packets = b->packets; -- } while (running && read_seqcount_retry(running, seq)); -+ } while (running && net_seq_retry(running, seq)); - } - EXPORT_SYMBOL(__gnet_stats_copy_basic); - - static int --___gnet_stats_copy_basic(const seqcount_t *running, -+___gnet_stats_copy_basic(net_seqlock_t *running, - struct gnet_dump *d, - struct gnet_stats_basic_cpu __percpu *cpu, - struct gnet_stats_basic_packed *b, -@@ -204,7 +204,7 @@ - * if the room in the socket buffer was not sufficient. - */ - int --gnet_stats_copy_basic(const seqcount_t *running, -+gnet_stats_copy_basic(net_seqlock_t *running, - struct gnet_dump *d, - struct gnet_stats_basic_cpu __percpu *cpu, - struct gnet_stats_basic_packed *b) -@@ -228,7 +228,7 @@ - * if the room in the socket buffer was not sufficient. 
- */ - int --gnet_stats_copy_basic_hw(const seqcount_t *running, -+gnet_stats_copy_basic_hw(net_seqlock_t *running, - struct gnet_dump *d, - struct gnet_stats_basic_cpu __percpu *cpu, - struct gnet_stats_basic_packed *b) ---- a/net/sched/sch_api.c -+++ b/net/sched/sch_api.c -@@ -1258,7 +1258,7 @@ - rcu_assign_pointer(sch->stab, stab); - } - if (tca[TCA_RATE]) { -- seqcount_t *running; -+ net_seqlock_t *running; - - err = -EOPNOTSUPP; - if (sch->flags & TCQ_F_MQROOT) { ---- a/net/sched/sch_generic.c -+++ b/net/sched/sch_generic.c -@@ -578,7 +578,11 @@ - .ops = &noop_qdisc_ops, - .q.lock = __SPIN_LOCK_UNLOCKED(noop_qdisc.q.lock), - .dev_queue = &noop_netdev_queue, -+#ifdef CONFIG_PREEMPT_RT -+ .running = __SEQLOCK_UNLOCKED(noop_qdisc.running), -+#else - .running = SEQCNT_ZERO(noop_qdisc.running), -+#endif - .busylock = __SPIN_LOCK_UNLOCKED(noop_qdisc.busylock), - .gso_skb = { - .next = (struct sk_buff *)&noop_qdisc.gso_skb, -@@ -889,9 +893,15 @@ - lockdep_set_class(&sch->busylock, - dev->qdisc_tx_busylock ?: &qdisc_tx_busylock); - -+#ifdef CONFIG_PREEMPT_RT -+ seqlock_init(&sch->running); -+ lockdep_set_class(&sch->running.lock, -+ dev->qdisc_running_key ?: &qdisc_running_key); -+#else - seqcount_init(&sch->running); - lockdep_set_class(&sch->running, - dev->qdisc_running_key ?: &qdisc_running_key); -+#endif - - sch->ops = ops; - sch->flags = ops->static_flags; diff --git a/debian/patches-rt/0190-net-Properly-annotate-the-try-lock-for-the-seqlock.patch b/debian/patches-rt/0190-net-Properly-annotate-the-try-lock-for-the-seqlock.patch deleted file mode 100644 index 4b8ea61ce..000000000 --- a/debian/patches-rt/0190-net-Properly-annotate-the-try-lock-for-the-seqlock.patch +++ /dev/null @@ -1,70 +0,0 @@ -From 0e870a0801b4d99913bbbeb87d9cda926a8ed8cc Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 8 Sep 2020 16:57:11 +0200 -Subject: [PATCH 190/296] net: Properly annotate the try-lock for the seqlock -Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -In patch - ("net/Qdisc: use a seqlock instead seqcount") - -the seqcount has been replaced with a seqlock to allow to reader to -boost the preempted writer. -The try_write_seqlock() acquired the lock with a try-lock but the -seqcount annotation was "lock". - -Opencode write_seqcount_t_begin() and use the try-lock annotation for -lockdep. - -Reported-by: Mike Galbraith <efault@gmx.de> -Cc: stable-rt@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/net/net_seq_lock.h | 9 --------- - include/net/sch_generic.h | 10 +++++++++- - 2 files changed, 9 insertions(+), 10 deletions(-) - -diff --git a/include/net/net_seq_lock.h b/include/net/net_seq_lock.h -index 95a497a72e51..67710bace741 100644 ---- a/include/net/net_seq_lock.h -+++ b/include/net/net_seq_lock.h -@@ -6,15 +6,6 @@ - # define net_seq_begin(__r) read_seqbegin(__r) - # define net_seq_retry(__r, __s) read_seqretry(__r, __s) - --static inline int try_write_seqlock(seqlock_t *sl) --{ -- if (spin_trylock(&sl->lock)) { -- write_seqcount_begin(&sl->seqcount); -- return 1; -- } -- return 0; --} -- - #else - # define net_seqlock_t seqcount_t - # define net_seq_begin(__r) read_seqcount_begin(__r) -diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h -index 317fcb0a659c..6a0434d2c279 100644 ---- a/include/net/sch_generic.h -+++ b/include/net/sch_generic.h -@@ -171,8 +171,16 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc) - return false; - } - #ifdef CONFIG_PREEMPT_RT -- if (try_write_seqlock(&qdisc->running)) -+ if (spin_trylock(&qdisc->running.lock)) { -+ seqcount_t *s = &qdisc->running.seqcount.seqcount; -+ /* -+ * Variant of write_seqcount_t_begin() telling lockdep that a -+ * trylock was attempted. 
-+ */ -+ raw_write_seqcount_t_begin(s); -+ seqcount_acquire(&s->dep_map, 0, 1, _RET_IP_); - return true; -+ } - return false; - #else - /* Variant of write_seqcount_begin() telling lockdep a trylock --- -2.30.2 - diff --git a/debian/patches-rt/0191-kconfig-Disable-config-options-which-are-not-RT-comp.patch b/debian/patches-rt/0191-kconfig-Disable-config-options-which-are-not-RT-comp.patch deleted file mode 100644 index 0f9955b0c..000000000 --- a/debian/patches-rt/0191-kconfig-Disable-config-options-which-are-not-RT-comp.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 3388d6ff514eea2010dc0e5176c2c0f2ea88d1d3 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Sun, 24 Jul 2011 12:11:43 +0200 -Subject: [PATCH 191/296] kconfig: Disable config options which are not RT - compatible -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Disable stuff which is known to have issues on RT - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - arch/Kconfig | 1 + - mm/Kconfig | 2 +- - 2 files changed, 2 insertions(+), 1 deletion(-) - -diff --git a/arch/Kconfig b/arch/Kconfig -index 3a036011ac8d..62fb1035e662 100644 ---- a/arch/Kconfig -+++ b/arch/Kconfig -@@ -37,6 +37,7 @@ config OPROFILE - tristate "OProfile system profiling" - depends on PROFILING - depends on HAVE_OPROFILE -+ depends on !PREEMPT_RT - select RING_BUFFER - select RING_BUFFER_ALLOW_SWAP - help -diff --git a/mm/Kconfig b/mm/Kconfig -index 8c49d09da214..c8cbcb5118b0 100644 ---- a/mm/Kconfig -+++ b/mm/Kconfig -@@ -387,7 +387,7 @@ config NOMMU_INITIAL_TRIM_EXCESS - - config TRANSPARENT_HUGEPAGE - bool "Transparent Hugepage Support" -- depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE -+ depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE && !PREEMPT_RT - select COMPACTION - select XARRAY_MULTI - help --- -2.30.2 - diff --git a/debian/patches-rt/0192-mm-Allow-only-SLUB-on-RT.patch b/debian/patches-rt/0192-mm-Allow-only-SLUB-on-RT.patch deleted file mode 100644 
index c7fa8f35b..000000000 --- a/debian/patches-rt/0192-mm-Allow-only-SLUB-on-RT.patch +++ /dev/null @@ -1,47 +0,0 @@ -From 9cc1f11b7cb9624b27d4d8dcb35564ae006648d4 Mon Sep 17 00:00:00 2001 -From: Ingo Molnar <mingo@elte.hu> -Date: Fri, 3 Jul 2009 08:44:03 -0500 -Subject: [PATCH 192/296] mm: Allow only SLUB on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Memory allocation disables interrupts as part of the allocation and freeing -process. For -RT it is important that this section remain short and don't -depend on the size of the request or an internal state of the memory allocator. -At the beginning the SLAB memory allocator was adopted for RT's needs and it -required substantial changes. Later, with the addition of the SLUB memory -allocator we adopted this one as well and the changes were smaller. More -important, due to the design of the SLUB allocator it performs better and its -worst case latency was smaller. In the end only SLUB remained supported. - -Disable SLAB and SLOB on -RT. Only SLUB is adopted to -RT needs. - -Signed-off-by: Ingo Molnar <mingo@elte.hu> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - init/Kconfig | 2 ++ - 1 file changed, 2 insertions(+) - -diff --git a/init/Kconfig b/init/Kconfig -index fc4c9f416fad..1f6c20e56ea9 100644 ---- a/init/Kconfig -+++ b/init/Kconfig -@@ -1895,6 +1895,7 @@ choice - - config SLAB - bool "SLAB" -+ depends on !PREEMPT_RT - select HAVE_HARDENED_USERCOPY_ALLOCATOR - help - The regular slab allocator that is established and known to work -@@ -1915,6 +1916,7 @@ config SLUB - config SLOB - depends on EXPERT - bool "SLOB (Simple Allocator)" -+ depends on !PREEMPT_RT - help - SLOB replaces the stock allocator with a drastically simpler - allocator. 
SLOB is generally more space efficient but --- -2.30.2 - diff --git a/debian/patches-rt/0193-sched-Disable-CONFIG_RT_GROUP_SCHED-on-RT.patch b/debian/patches-rt/0193-sched-Disable-CONFIG_RT_GROUP_SCHED-on-RT.patch deleted file mode 100644 index 10b046c71..000000000 --- a/debian/patches-rt/0193-sched-Disable-CONFIG_RT_GROUP_SCHED-on-RT.patch +++ /dev/null @@ -1,35 +0,0 @@ -From 71fdb57aabc971fb80e86daab7c7a2850e4e8a85 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon, 18 Jul 2011 17:03:52 +0200 -Subject: [PATCH 193/296] sched: Disable CONFIG_RT_GROUP_SCHED on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Carsten reported problems when running: - - taskset 01 chrt -f 1 sleep 1 - -from within rc.local on a F15 machine. The task stays running and -never gets on the run queue because some of the run queues have -rt_throttled=1 which does not go away. Works nice from a ssh login -shell. Disabling CONFIG_RT_GROUP_SCHED solves that as well. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - init/Kconfig | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/init/Kconfig b/init/Kconfig -index 1f6c20e56ea9..3fe140f4f0ed 100644 ---- a/init/Kconfig -+++ b/init/Kconfig -@@ -968,6 +968,7 @@ config CFS_BANDWIDTH - config RT_GROUP_SCHED - bool "Group scheduling for SCHED_RR/FIFO" - depends on CGROUP_SCHED -+ depends on !PREEMPT_RT - default n - help - This feature lets you explicitly allocate real CPU bandwidth --- -2.30.2 - diff --git a/debian/patches-rt/0194-net-core-disable-NET_RX_BUSY_POLL-on-RT.patch b/debian/patches-rt/0194-net-core-disable-NET_RX_BUSY_POLL-on-RT.patch deleted file mode 100644 index 9a353046c..000000000 --- a/debian/patches-rt/0194-net-core-disable-NET_RX_BUSY_POLL-on-RT.patch +++ /dev/null @@ -1,44 +0,0 @@ -From 82758c13489e05e6a18c7892dbb16652811199d3 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Sat, 27 May 2017 19:02:06 +0200 -Subject: [PATCH 194/296] net/core: disable NET_RX_BUSY_POLL on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -napi_busy_loop() disables preemption and performs a NAPI poll. We can't acquire -sleeping locks with disabled preemption so we would have to work around this -and add explicit locking for synchronisation against ksoftirqd. -Without explicit synchronisation a low priority process would "own" the NAPI -state (by setting NAPIF_STATE_SCHED) and could be scheduled out (no -preempt_disable() and BH is preemptible on RT). -In case a network packages arrives then the interrupt handler would set -NAPIF_STATE_MISSED and the system would wait until the task owning the NAPI -would be scheduled in again. -Should a task with RT priority busy poll then it would consume the CPU instead -allowing tasks with lower priority to run. 
- -The NET_RX_BUSY_POLL is disabled by default (the system wide sysctls for -poll/read are set to zero) so disable NET_RX_BUSY_POLL on RT to avoid wrong -locking context on RT. Should this feature be considered useful on RT systems -then it could be enabled again with proper locking and synchronisation. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - net/Kconfig | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/net/Kconfig b/net/Kconfig -index d6567162c1cf..05b0f041f039 100644 ---- a/net/Kconfig -+++ b/net/Kconfig -@@ -282,7 +282,7 @@ config CGROUP_NET_CLASSID - - config NET_RX_BUSY_POLL - bool -- default y -+ default y if !PREEMPT_RT - - config BQL - bool --- -2.30.2 - diff --git a/debian/patches-rt/0197-rt-Add-local-irq-locks.patch b/debian/patches-rt/0197-rt-Add-local-irq-locks.patch deleted file mode 100644 index 370a5fb8a..000000000 --- a/debian/patches-rt/0197-rt-Add-local-irq-locks.patch +++ /dev/null @@ -1,206 +0,0 @@ -From 0fcc65835aa23a74ac1b85c87e184a9487182600 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon, 20 Jun 2011 09:03:47 +0200 -Subject: [PATCH 197/296] rt: Add local irq locks -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Introduce locallock. For !RT this maps to preempt_disable()/ -local_irq_disable() so there is not much that changes. For RT this will -map to a spinlock. This makes preemption possible and locked "ressource" -gets the lockdep anotation it wouldn't have otherwise. The locks are -recursive for owner == current. Also, all locks user migrate_disable() -which ensures that the task is not migrated to another CPU while the lock -is held and the owner is preempted. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/local_lock_internal.h | 126 +++++++++++++++++++++++++--- - 1 file changed, 113 insertions(+), 13 deletions(-) - -diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h -index 4a8795b21d77..271f911f2803 100644 ---- a/include/linux/local_lock_internal.h -+++ b/include/linux/local_lock_internal.h -@@ -7,33 +7,90 @@ - #include <linux/lockdep.h> - - typedef struct { --#ifdef CONFIG_DEBUG_LOCK_ALLOC -+#ifdef CONFIG_PREEMPT_RT -+ spinlock_t lock; -+ struct task_struct *owner; -+ int nestcnt; -+ -+#elif defined(CONFIG_DEBUG_LOCK_ALLOC) - struct lockdep_map dep_map; - struct task_struct *owner; - #endif - } local_lock_t; - --#ifdef CONFIG_DEBUG_LOCK_ALLOC --# define LL_DEP_MAP_INIT(lockname) \ -+#ifdef CONFIG_PREEMPT_RT -+ -+#define INIT_LOCAL_LOCK(lockname) { \ -+ __SPIN_LOCK_UNLOCKED((lockname).lock), \ -+ .owner = NULL, \ -+ .nestcnt = 0, \ -+ } -+#else -+ -+# ifdef CONFIG_DEBUG_LOCK_ALLOC -+# define LL_DEP_MAP_INIT(lockname) \ - .dep_map = { \ - .name = #lockname, \ - .wait_type_inner = LD_WAIT_CONFIG, \ - } --#else --# define LL_DEP_MAP_INIT(lockname) --#endif -+# else -+# define LL_DEP_MAP_INIT(lockname) -+# endif - - #define INIT_LOCAL_LOCK(lockname) { LL_DEP_MAP_INIT(lockname) } - --#define __local_lock_init(lock) \ -+#endif -+ -+#ifdef CONFIG_PREEMPT_RT -+ -+static inline void ___local_lock_init(local_lock_t *l) -+{ -+ l->owner = NULL; -+ l->nestcnt = 0; -+} -+ -+#define __local_lock_init(l) \ -+do { \ -+ spin_lock_init(&(l)->lock); \ -+ ___local_lock_init(l); \ -+} while (0) -+ -+#else -+ -+#define __local_lock_init(l) \ - do { \ - static struct lock_class_key __key; \ - \ -- debug_check_no_locks_freed((void *)lock, sizeof(*lock));\ -- lockdep_init_map_wait(&(lock)->dep_map, #lock, &__key, 0, LD_WAIT_CONFIG);\ -+ debug_check_no_locks_freed((void *)l, sizeof(*l)); \ -+ lockdep_init_map_wait(&(l)->dep_map, #l, &__key, 0, LD_WAIT_CONFIG);\ - } while (0) 
-+#endif -+ -+#ifdef CONFIG_PREEMPT_RT -+ -+static inline void local_lock_acquire(local_lock_t *l) -+{ -+ if (l->owner != current) { -+ spin_lock(&l->lock); -+ DEBUG_LOCKS_WARN_ON(l->owner); -+ DEBUG_LOCKS_WARN_ON(l->nestcnt); -+ l->owner = current; -+ } -+ l->nestcnt++; -+} -+ -+static inline void local_lock_release(local_lock_t *l) -+{ -+ DEBUG_LOCKS_WARN_ON(l->nestcnt == 0); -+ DEBUG_LOCKS_WARN_ON(l->owner != current); -+ if (--l->nestcnt) -+ return; -+ -+ l->owner = NULL; -+ spin_unlock(&l->lock); -+} - --#ifdef CONFIG_DEBUG_LOCK_ALLOC -+#elif defined(CONFIG_DEBUG_LOCK_ALLOC) - static inline void local_lock_acquire(local_lock_t *l) - { - lock_map_acquire(&l->dep_map); -@@ -53,21 +110,50 @@ static inline void local_lock_acquire(local_lock_t *l) { } - static inline void local_lock_release(local_lock_t *l) { } - #endif /* !CONFIG_DEBUG_LOCK_ALLOC */ - -+#ifdef CONFIG_PREEMPT_RT -+ - #define __local_lock(lock) \ - do { \ -- preempt_disable(); \ -+ migrate_disable(); \ - local_lock_acquire(this_cpu_ptr(lock)); \ - } while (0) - -+#define __local_unlock(lock) \ -+ do { \ -+ local_lock_release(this_cpu_ptr(lock)); \ -+ migrate_enable(); \ -+ } while (0) -+ - #define __local_lock_irq(lock) \ - do { \ -- local_irq_disable(); \ -+ migrate_disable(); \ - local_lock_acquire(this_cpu_ptr(lock)); \ - } while (0) - - #define __local_lock_irqsave(lock, flags) \ - do { \ -- local_irq_save(flags); \ -+ migrate_disable(); \ -+ flags = 0; \ -+ local_lock_acquire(this_cpu_ptr(lock)); \ -+ } while (0) -+ -+#define __local_unlock_irq(lock) \ -+ do { \ -+ local_lock_release(this_cpu_ptr(lock)); \ -+ migrate_enable(); \ -+ } while (0) -+ -+#define __local_unlock_irqrestore(lock, flags) \ -+ do { \ -+ local_lock_release(this_cpu_ptr(lock)); \ -+ migrate_enable(); \ -+ } while (0) -+ -+#else -+ -+#define __local_lock(lock) \ -+ do { \ -+ preempt_disable(); \ - local_lock_acquire(this_cpu_ptr(lock)); \ - } while (0) - -@@ -77,6 +163,18 @@ static inline void local_lock_release(local_lock_t 
*l) { } - preempt_enable(); \ - } while (0) - -+#define __local_lock_irq(lock) \ -+ do { \ -+ local_irq_disable(); \ -+ local_lock_acquire(this_cpu_ptr(lock)); \ -+ } while (0) -+ -+#define __local_lock_irqsave(lock, flags) \ -+ do { \ -+ local_irq_save(flags); \ -+ local_lock_acquire(this_cpu_ptr(lock)); \ -+ } while (0) -+ - #define __local_unlock_irq(lock) \ - do { \ - local_lock_release(this_cpu_ptr(lock)); \ -@@ -88,3 +186,5 @@ static inline void local_lock_release(local_lock_t *l) { } - local_lock_release(this_cpu_ptr(lock)); \ - local_irq_restore(flags); \ - } while (0) -+ -+#endif --- -2.30.2 - diff --git a/debian/patches-rt/0199-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch b/debian/patches-rt/0199-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch deleted file mode 100644 index 3cd6409c0..000000000 --- a/debian/patches-rt/0199-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch +++ /dev/null @@ -1,172 +0,0 @@ -From 11586581f2850f2a3fbfe9772c0dca796aa5fe88 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Mon, 28 May 2018 15:24:20 +0200 -Subject: [PATCH 199/296] Split IRQ-off and zone->lock while freeing pages from - PCP list #1 -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Split the IRQ-off section while accessing the PCP list from zone->lock -while freeing pages. -Introcude isolate_pcp_pages() which separates the pages from the PCP -list onto a temporary list and then free the temporary list via -free_pcppages_bulk(). 
- -Signed-off-by: Peter Zijlstra <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/page_alloc.c | 81 +++++++++++++++++++++++++++++++------------------ - 1 file changed, 51 insertions(+), 30 deletions(-) - -diff --git a/mm/page_alloc.c b/mm/page_alloc.c -index 7ffa706e5c30..0ad4f4aaa36d 100644 ---- a/mm/page_alloc.c -+++ b/mm/page_alloc.c -@@ -1331,7 +1331,7 @@ static inline void prefetch_buddy(struct page *page) - } - - /* -- * Frees a number of pages from the PCP lists -+ * Frees a number of pages which have been collected from the pcp lists. - * Assumes all pages on list are in same zone, and of same order. - * count is the number of pages to free. - * -@@ -1342,14 +1342,40 @@ static inline void prefetch_buddy(struct page *page) - * pinned" detection logic. - */ - static void free_pcppages_bulk(struct zone *zone, int count, -- struct per_cpu_pages *pcp) -+ struct list_head *head) -+{ -+ bool isolated_pageblocks; -+ struct page *page, *tmp; -+ unsigned long flags; -+ -+ spin_lock_irqsave(&zone->lock, flags); -+ isolated_pageblocks = has_isolate_pageblock(zone); -+ -+ /* -+ * Use safe version since after __free_one_page(), -+ * page->lru.next will not point to original list. 
-+ */ -+ list_for_each_entry_safe(page, tmp, head, lru) { -+ int mt = get_pcppage_migratetype(page); -+ /* MIGRATE_ISOLATE page should not go to pcplists */ -+ VM_BUG_ON_PAGE(is_migrate_isolate(mt), page); -+ /* Pageblock could have been isolated meanwhile */ -+ if (unlikely(isolated_pageblocks)) -+ mt = get_pageblock_migratetype(page); -+ -+ __free_one_page(page, page_to_pfn(page), zone, 0, mt, FPI_NONE); -+ trace_mm_page_pcpu_drain(page, 0, mt); -+ } -+ spin_unlock_irqrestore(&zone->lock, flags); -+} -+ -+static void isolate_pcp_pages(int count, struct per_cpu_pages *pcp, -+ struct list_head *dst) - { - int migratetype = 0; - int batch_free = 0; - int prefetch_nr = 0; -- bool isolated_pageblocks; -- struct page *page, *tmp; -- LIST_HEAD(head); -+ struct page *page; - - /* - * Ensure proper count is passed which otherwise would stuck in the -@@ -1386,7 +1412,7 @@ static void free_pcppages_bulk(struct zone *zone, int count, - if (bulkfree_pcp_prepare(page)) - continue; - -- list_add_tail(&page->lru, &head); -+ list_add_tail(&page->lru, dst); - - /* - * We are going to put the page back to the global -@@ -1401,26 +1427,6 @@ static void free_pcppages_bulk(struct zone *zone, int count, - prefetch_buddy(page); - } while (--count && --batch_free && !list_empty(list)); - } -- -- spin_lock(&zone->lock); -- isolated_pageblocks = has_isolate_pageblock(zone); -- -- /* -- * Use safe version since after __free_one_page(), -- * page->lru.next will not point to original list. 
-- */ -- list_for_each_entry_safe(page, tmp, &head, lru) { -- int mt = get_pcppage_migratetype(page); -- /* MIGRATE_ISOLATE page should not go to pcplists */ -- VM_BUG_ON_PAGE(is_migrate_isolate(mt), page); -- /* Pageblock could have been isolated meanwhile */ -- if (unlikely(isolated_pageblocks)) -- mt = get_pageblock_migratetype(page); -- -- __free_one_page(page, page_to_pfn(page), zone, 0, mt, FPI_NONE); -- trace_mm_page_pcpu_drain(page, 0, mt); -- } -- spin_unlock(&zone->lock); - } - - static void free_one_page(struct zone *zone, -@@ -2938,13 +2944,18 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp) - { - unsigned long flags; - int to_drain, batch; -+ LIST_HEAD(dst); - - local_irq_save(flags); - batch = READ_ONCE(pcp->batch); - to_drain = min(pcp->count, batch); - if (to_drain > 0) -- free_pcppages_bulk(zone, to_drain, pcp); -+ isolate_pcp_pages(to_drain, pcp, &dst); -+ - local_irq_restore(flags); -+ -+ if (to_drain > 0) -+ free_pcppages_bulk(zone, to_drain, &dst); - } - #endif - -@@ -2960,14 +2971,21 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone) - unsigned long flags; - struct per_cpu_pageset *pset; - struct per_cpu_pages *pcp; -+ LIST_HEAD(dst); -+ int count; - - local_irq_save(flags); - pset = per_cpu_ptr(zone->pageset, cpu); - - pcp = &pset->pcp; -- if (pcp->count) -- free_pcppages_bulk(zone, pcp->count, pcp); -+ count = pcp->count; -+ if (count) -+ isolate_pcp_pages(count, pcp, &dst); -+ - local_irq_restore(flags); -+ -+ if (count) -+ free_pcppages_bulk(zone, count, &dst); - } - - /* -@@ -3196,7 +3214,10 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn) - pcp->count++; - if (pcp->count >= pcp->high) { - unsigned long batch = READ_ONCE(pcp->batch); -- free_pcppages_bulk(zone, batch, pcp); -+ LIST_HEAD(dst); -+ -+ isolate_pcp_pages(batch, pcp, &dst); -+ free_pcppages_bulk(zone, batch, &dst); - } - } - --- -2.30.2 - diff --git 
a/debian/patches-rt/0200-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch b/debian/patches-rt/0200-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch deleted file mode 100644 index b124b1d93..000000000 --- a/debian/patches-rt/0200-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch +++ /dev/null @@ -1,172 +0,0 @@ -From 34792dc89bdd9aa85005135f66a19f40eb8ea229 Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Mon, 28 May 2018 15:24:21 +0200 -Subject: [PATCH 200/296] Split IRQ-off and zone->lock while freeing pages from - PCP list #2 -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Split the IRQ-off section while accessing the PCP list from zone->lock -while freeing pages. -Introcude isolate_pcp_pages() which separates the pages from the PCP -list onto a temporary list and then free the temporary list via -free_pcppages_bulk(). - -Signed-off-by: Peter Zijlstra <peterz@infradead.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/page_alloc.c | 60 ++++++++++++++++++++++++++++++++++++++++--------- - 1 file changed, 50 insertions(+), 10 deletions(-) - -diff --git a/mm/page_alloc.c b/mm/page_alloc.c -index 0ad4f4aaa36d..84ee66afc232 100644 ---- a/mm/page_alloc.c -+++ b/mm/page_alloc.c -@@ -1341,8 +1341,8 @@ static inline void prefetch_buddy(struct page *page) - * And clear the zone's pages_scanned counter, to hold off the "all pages are - * pinned" detection logic. 
- */ --static void free_pcppages_bulk(struct zone *zone, int count, -- struct list_head *head) -+static void free_pcppages_bulk(struct zone *zone, struct list_head *head, -+ bool zone_retry) - { - bool isolated_pageblocks; - struct page *page, *tmp; -@@ -1357,12 +1357,27 @@ static void free_pcppages_bulk(struct zone *zone, int count, - */ - list_for_each_entry_safe(page, tmp, head, lru) { - int mt = get_pcppage_migratetype(page); -+ -+ if (page_zone(page) != zone) { -+ /* -+ * free_unref_page_list() sorts pages by zone. If we end -+ * up with pages from a different NUMA nodes belonging -+ * to the same ZONE index then we need to redo with the -+ * correct ZONE pointer. Skip the page for now, redo it -+ * on the next iteration. -+ */ -+ WARN_ON_ONCE(zone_retry == false); -+ if (zone_retry) -+ continue; -+ } -+ - /* MIGRATE_ISOLATE page should not go to pcplists */ - VM_BUG_ON_PAGE(is_migrate_isolate(mt), page); - /* Pageblock could have been isolated meanwhile */ - if (unlikely(isolated_pageblocks)) - mt = get_pageblock_migratetype(page); - -+ list_del(&page->lru); - __free_one_page(page, page_to_pfn(page), zone, 0, mt, FPI_NONE); - trace_mm_page_pcpu_drain(page, 0, mt); - } -@@ -2955,7 +2970,7 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp) - local_irq_restore(flags); - - if (to_drain > 0) -- free_pcppages_bulk(zone, to_drain, &dst); -+ free_pcppages_bulk(zone, &dst, false); - } - #endif - -@@ -2985,7 +3000,7 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone) - local_irq_restore(flags); - - if (count) -- free_pcppages_bulk(zone, count, &dst); -+ free_pcppages_bulk(zone, &dst, false); - } - - /* -@@ -3184,7 +3199,8 @@ static bool free_unref_page_prepare(struct page *page, unsigned long pfn) - return true; - } - --static void free_unref_page_commit(struct page *page, unsigned long pfn) -+static void free_unref_page_commit(struct page *page, unsigned long pfn, -+ struct list_head *dst) - { - struct zone *zone = page_zone(page); 
- struct per_cpu_pages *pcp; -@@ -3214,10 +3230,8 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn) - pcp->count++; - if (pcp->count >= pcp->high) { - unsigned long batch = READ_ONCE(pcp->batch); -- LIST_HEAD(dst); - -- isolate_pcp_pages(batch, pcp, &dst); -- free_pcppages_bulk(zone, batch, &dst); -+ isolate_pcp_pages(batch, pcp, dst); - } - } - -@@ -3228,13 +3242,17 @@ void free_unref_page(struct page *page) - { - unsigned long flags; - unsigned long pfn = page_to_pfn(page); -+ struct zone *zone = page_zone(page); -+ LIST_HEAD(dst); - - if (!free_unref_page_prepare(page, pfn)) - return; - - local_irq_save(flags); -- free_unref_page_commit(page, pfn); -+ free_unref_page_commit(page, pfn, &dst); - local_irq_restore(flags); -+ if (!list_empty(&dst)) -+ free_pcppages_bulk(zone, &dst, false); - } - - /* -@@ -3245,6 +3263,11 @@ void free_unref_page_list(struct list_head *list) - struct page *page, *next; - unsigned long flags, pfn; - int batch_count = 0; -+ struct list_head dsts[__MAX_NR_ZONES]; -+ int i; -+ -+ for (i = 0; i < __MAX_NR_ZONES; i++) -+ INIT_LIST_HEAD(&dsts[i]); - - /* Prepare pages for freeing */ - list_for_each_entry_safe(page, next, list, lru) { -@@ -3257,10 +3280,12 @@ void free_unref_page_list(struct list_head *list) - local_irq_save(flags); - list_for_each_entry_safe(page, next, list, lru) { - unsigned long pfn = page_private(page); -+ enum zone_type type; - - set_page_private(page, 0); - trace_mm_page_free_batched(page); -- free_unref_page_commit(page, pfn); -+ type = page_zonenum(page); -+ free_unref_page_commit(page, pfn, &dsts[type]); - - /* - * Guard against excessive IRQ disabled times when we get -@@ -3273,6 +3298,21 @@ void free_unref_page_list(struct list_head *list) - } - } - local_irq_restore(flags); -+ -+ for (i = 0; i < __MAX_NR_ZONES; ) { -+ struct page *page; -+ struct zone *zone; -+ -+ if (list_empty(&dsts[i])) { -+ i++; -+ continue; -+ } -+ -+ page = list_first_entry(&dsts[i], struct page, lru); -+ zone = 
page_zone(page); -+ -+ free_pcppages_bulk(zone, &dsts[i], true); -+ } - } - - /* --- -2.30.2 - diff --git a/debian/patches-rt/0201-mm-SLxB-change-list_lock-to-raw_spinlock_t.patch b/debian/patches-rt/0201-mm-SLxB-change-list_lock-to-raw_spinlock_t.patch deleted file mode 100644 index 9cfe29d99..000000000 --- a/debian/patches-rt/0201-mm-SLxB-change-list_lock-to-raw_spinlock_t.patch +++ /dev/null @@ -1,603 +0,0 @@ -From 3da16c3dad48b9fffed0477f45dffff8672958ff Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon, 28 May 2018 15:24:22 +0200 -Subject: [PATCH 201/296] mm/SLxB: change list_lock to raw_spinlock_t -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The list_lock is used with IRQs off on RT. Make it a raw_spinlock_t, -otherwise the interrupts won't be disabled on -RT. The locking rules remain -the same on !RT. -This patch changes it for SLAB and SLUB since both share the same header -file for the struct kmem_cache_node definition. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/slab.c | 90 +++++++++++++++++++++++++++---------------------------- - mm/slab.h | 2 +- - mm/slub.c | 50 +++++++++++++++---------------- - 3 files changed, 71 insertions(+), 71 deletions(-) - -diff --git a/mm/slab.c b/mm/slab.c -index b1113561b98b..a28b54325d9e 100644 ---- a/mm/slab.c -+++ b/mm/slab.c -@@ -233,7 +233,7 @@ static void kmem_cache_node_init(struct kmem_cache_node *parent) - parent->shared = NULL; - parent->alien = NULL; - parent->colour_next = 0; -- spin_lock_init(&parent->list_lock); -+ raw_spin_lock_init(&parent->list_lock); - parent->free_objects = 0; - parent->free_touched = 0; - } -@@ -558,9 +558,9 @@ static noinline void cache_free_pfmemalloc(struct kmem_cache *cachep, - page_node = page_to_nid(page); - n = get_node(cachep, page_node); - -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - free_block(cachep, &objp, 1, page_node, &list); -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - - slabs_destroy(cachep, &list); - } -@@ -698,7 +698,7 @@ static void __drain_alien_cache(struct kmem_cache *cachep, - struct kmem_cache_node *n = get_node(cachep, node); - - if (ac->avail) { -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - /* - * Stuff objects into the remote nodes shared array first. 
- * That way we could avoid the overhead of putting the objects -@@ -709,7 +709,7 @@ static void __drain_alien_cache(struct kmem_cache *cachep, - - free_block(cachep, ac->entry, ac->avail, node, list); - ac->avail = 0; -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - } - } - -@@ -782,9 +782,9 @@ static int __cache_free_alien(struct kmem_cache *cachep, void *objp, - slabs_destroy(cachep, &list); - } else { - n = get_node(cachep, page_node); -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - free_block(cachep, &objp, 1, page_node, &list); -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - slabs_destroy(cachep, &list); - } - return 1; -@@ -825,10 +825,10 @@ static int init_cache_node(struct kmem_cache *cachep, int node, gfp_t gfp) - */ - n = get_node(cachep, node); - if (n) { -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - n->free_limit = (1 + nr_cpus_node(node)) * cachep->batchcount + - cachep->num; -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - - return 0; - } -@@ -907,7 +907,7 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep, - goto fail; - - n = get_node(cachep, node); -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - if (n->shared && force_change) { - free_block(cachep, n->shared->entry, - n->shared->avail, node, &list); -@@ -925,7 +925,7 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep, - new_alien = NULL; - } - -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - slabs_destroy(cachep, &list); - - /* -@@ -964,7 +964,7 @@ static void cpuup_canceled(long cpu) - if (!n) - continue; - -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - - /* Free limit for this kmem_cache_node */ - n->free_limit -= cachep->batchcount; -@@ -975,7 +975,7 @@ static void cpuup_canceled(long cpu) - nc->avail = 0; - - if (!cpumask_empty(mask)) { -- spin_unlock_irq(&n->list_lock); -+ 
raw_spin_unlock_irq(&n->list_lock); - goto free_slab; - } - -@@ -989,7 +989,7 @@ static void cpuup_canceled(long cpu) - alien = n->alien; - n->alien = NULL; - -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - - kfree(shared); - if (alien) { -@@ -1173,7 +1173,7 @@ static void __init init_list(struct kmem_cache *cachep, struct kmem_cache_node * - /* - * Do not assume that spinlocks can be initialized via memcpy: - */ -- spin_lock_init(&ptr->list_lock); -+ raw_spin_lock_init(&ptr->list_lock); - - MAKE_ALL_LISTS(cachep, ptr, nodeid); - cachep->node[nodeid] = ptr; -@@ -1344,11 +1344,11 @@ slab_out_of_memory(struct kmem_cache *cachep, gfp_t gfpflags, int nodeid) - for_each_kmem_cache_node(cachep, node, n) { - unsigned long total_slabs, free_slabs, free_objs; - -- spin_lock_irqsave(&n->list_lock, flags); -+ raw_spin_lock_irqsave(&n->list_lock, flags); - total_slabs = n->total_slabs; - free_slabs = n->free_slabs; - free_objs = n->free_objects; -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - - pr_warn(" node %d: slabs: %ld/%ld, objs: %ld/%ld\n", - node, total_slabs - free_slabs, total_slabs, -@@ -2106,7 +2106,7 @@ static void check_spinlock_acquired(struct kmem_cache *cachep) - { - #ifdef CONFIG_SMP - check_irq_off(); -- assert_spin_locked(&get_node(cachep, numa_mem_id())->list_lock); -+ assert_raw_spin_locked(&get_node(cachep, numa_mem_id())->list_lock); - #endif - } - -@@ -2114,7 +2114,7 @@ static void check_spinlock_acquired_node(struct kmem_cache *cachep, int node) - { - #ifdef CONFIG_SMP - check_irq_off(); -- assert_spin_locked(&get_node(cachep, node)->list_lock); -+ assert_raw_spin_locked(&get_node(cachep, node)->list_lock); - #endif - } - -@@ -2154,9 +2154,9 @@ static void do_drain(void *arg) - check_irq_off(); - ac = cpu_cache_get(cachep); - n = get_node(cachep, node); -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - free_block(cachep, ac->entry, ac->avail, node, &list); 
-- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - ac->avail = 0; - slabs_destroy(cachep, &list); - } -@@ -2174,9 +2174,9 @@ static void drain_cpu_caches(struct kmem_cache *cachep) - drain_alien_cache(cachep, n->alien); - - for_each_kmem_cache_node(cachep, node, n) { -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - drain_array_locked(cachep, n->shared, node, true, &list); -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - - slabs_destroy(cachep, &list); - } -@@ -2198,10 +2198,10 @@ static int drain_freelist(struct kmem_cache *cache, - nr_freed = 0; - while (nr_freed < tofree && !list_empty(&n->slabs_free)) { - -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - p = n->slabs_free.prev; - if (p == &n->slabs_free) { -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - goto out; - } - -@@ -2214,7 +2214,7 @@ static int drain_freelist(struct kmem_cache *cache, - * to the cache. - */ - n->free_objects -= cache->num; -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - slab_destroy(cache, page); - nr_freed++; - } -@@ -2650,7 +2650,7 @@ static void cache_grow_end(struct kmem_cache *cachep, struct page *page) - INIT_LIST_HEAD(&page->slab_list); - n = get_node(cachep, page_to_nid(page)); - -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - n->total_slabs++; - if (!page->active) { - list_add_tail(&page->slab_list, &n->slabs_free); -@@ -2660,7 +2660,7 @@ static void cache_grow_end(struct kmem_cache *cachep, struct page *page) - - STATS_INC_GROWN(cachep); - n->free_objects += cachep->num - page->active; -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - - fixup_objfreelist_debug(cachep, &list); - } -@@ -2826,7 +2826,7 @@ static struct page *get_first_slab(struct kmem_cache_node *n, bool pfmemalloc) - { - struct page *page; - -- assert_spin_locked(&n->list_lock); -+ assert_raw_spin_locked(&n->list_lock); - page = 
list_first_entry_or_null(&n->slabs_partial, struct page, - slab_list); - if (!page) { -@@ -2853,10 +2853,10 @@ static noinline void *cache_alloc_pfmemalloc(struct kmem_cache *cachep, - if (!gfp_pfmemalloc_allowed(flags)) - return NULL; - -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - page = get_first_slab(n, true); - if (!page) { -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - return NULL; - } - -@@ -2865,7 +2865,7 @@ static noinline void *cache_alloc_pfmemalloc(struct kmem_cache *cachep, - - fixup_slab_list(cachep, n, page, &list); - -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - fixup_objfreelist_debug(cachep, &list); - - return obj; -@@ -2924,7 +2924,7 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags) - if (!n->free_objects && (!shared || !shared->avail)) - goto direct_grow; - -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - shared = READ_ONCE(n->shared); - - /* See if we can refill from the shared array */ -@@ -2948,7 +2948,7 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags) - must_grow: - n->free_objects -= ac->avail; - alloc_done: -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - fixup_objfreelist_debug(cachep, &list); - - direct_grow: -@@ -3173,7 +3173,7 @@ static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, - BUG_ON(!n); - - check_irq_off(); -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - page = get_first_slab(n, false); - if (!page) - goto must_grow; -@@ -3191,12 +3191,12 @@ static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, - - fixup_slab_list(cachep, n, page, &list); - -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - fixup_objfreelist_debug(cachep, &list); - return obj; - - must_grow: -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - page = cache_grow_begin(cachep, gfp_exact_node(flags), nodeid); - if (page) { - /* This 
slab isn't counted yet so don't update free_objects */ -@@ -3374,7 +3374,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac) - - check_irq_off(); - n = get_node(cachep, node); -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - if (n->shared) { - struct array_cache *shared_array = n->shared; - int max = shared_array->limit - shared_array->avail; -@@ -3403,7 +3403,7 @@ static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac) - STATS_SET_FREEABLE(cachep, i); - } - #endif -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - ac->avail -= batchcount; - memmove(ac->entry, &(ac->entry[batchcount]), sizeof(void *)*ac->avail); - slabs_destroy(cachep, &list); -@@ -3832,9 +3832,9 @@ static int do_tune_cpucache(struct kmem_cache *cachep, int limit, - - node = cpu_to_mem(cpu); - n = get_node(cachep, node); -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - free_block(cachep, ac->entry, ac->avail, node, &list); -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - slabs_destroy(cachep, &list); - } - free_percpu(prev); -@@ -3929,9 +3929,9 @@ static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n, - return; - } - -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - drain_array_locked(cachep, ac, node, false, &list); -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - - slabs_destroy(cachep, &list); - } -@@ -4015,7 +4015,7 @@ void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo) - - for_each_kmem_cache_node(cachep, node, n) { - check_irq_on(); -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - - total_slabs += n->total_slabs; - free_slabs += n->free_slabs; -@@ -4024,7 +4024,7 @@ void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo) - if (n->shared) - shared_avail += n->shared->avail; - -- spin_unlock_irq(&n->list_lock); -+ 
raw_spin_unlock_irq(&n->list_lock); - } - num_objs = total_slabs * cachep->num; - active_slabs = total_slabs - free_slabs; -diff --git a/mm/slab.h b/mm/slab.h -index f9977d6613d6..c9a43b787609 100644 ---- a/mm/slab.h -+++ b/mm/slab.h -@@ -546,7 +546,7 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, - * The slab lists for all objects. - */ - struct kmem_cache_node { -- spinlock_t list_lock; -+ raw_spinlock_t list_lock; - - #ifdef CONFIG_SLAB - struct list_head slabs_partial; /* partial list first, better asm code */ -diff --git a/mm/slub.c b/mm/slub.c -index fbc415c34009..ff543c29c3c7 100644 ---- a/mm/slub.c -+++ b/mm/slub.c -@@ -1213,7 +1213,7 @@ static noinline int free_debug_processing( - unsigned long flags; - int ret = 0; - -- spin_lock_irqsave(&n->list_lock, flags); -+ raw_spin_lock_irqsave(&n->list_lock, flags); - slab_lock(page); - - if (s->flags & SLAB_CONSISTENCY_CHECKS) { -@@ -1248,7 +1248,7 @@ static noinline int free_debug_processing( - bulk_cnt, cnt); - - slab_unlock(page); -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - if (!ret) - slab_fix(s, "Object at 0x%p not freed", object); - return ret; -@@ -1962,7 +1962,7 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n, - if (!n || !n->nr_partial) - return NULL; - -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - list_for_each_entry_safe(page, page2, &n->partial, slab_list) { - void *t; - -@@ -1987,7 +1987,7 @@ static void *get_partial_node(struct kmem_cache *s, struct kmem_cache_node *n, - break; - - } -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - return object; - } - -@@ -2241,7 +2241,7 @@ static void deactivate_slab(struct kmem_cache *s, struct page *page, - * that acquire_slab() will see a slab page that - * is frozen - */ -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - } - } else { - m = M_FULL; -@@ -2253,7 +2253,7 @@ static void 
deactivate_slab(struct kmem_cache *s, struct page *page, - * slabs from diagnostic functions will not see - * any frozen slabs. - */ -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - } - #endif - } -@@ -2278,7 +2278,7 @@ static void deactivate_slab(struct kmem_cache *s, struct page *page, - goto redo; - - if (lock) -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - - if (m == M_PARTIAL) - stat(s, tail); -@@ -2317,10 +2317,10 @@ static void unfreeze_partials(struct kmem_cache *s, - n2 = get_node(s, page_to_nid(page)); - if (n != n2) { - if (n) -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - - n = n2; -- spin_lock(&n->list_lock); -+ raw_spin_lock(&n->list_lock); - } - - do { -@@ -2349,7 +2349,7 @@ static void unfreeze_partials(struct kmem_cache *s, - } - - if (n) -- spin_unlock(&n->list_lock); -+ raw_spin_unlock(&n->list_lock); - - while (discard_page) { - page = discard_page; -@@ -2516,10 +2516,10 @@ static unsigned long count_partial(struct kmem_cache_node *n, - unsigned long x = 0; - struct page *page; - -- spin_lock_irqsave(&n->list_lock, flags); -+ raw_spin_lock_irqsave(&n->list_lock, flags); - list_for_each_entry(page, &n->partial, slab_list) - x += get_count(page); -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - return x; - } - #endif /* CONFIG_SLUB_DEBUG || CONFIG_SYSFS */ -@@ -2978,7 +2978,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page, - - do { - if (unlikely(n)) { -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - n = NULL; - } - prior = page->freelist; -@@ -3010,7 +3010,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page, - * Otherwise the list_lock will synchronize with - * other processors updating the list of slabs. 
- */ -- spin_lock_irqsave(&n->list_lock, flags); -+ raw_spin_lock_irqsave(&n->list_lock, flags); - - } - } -@@ -3052,7 +3052,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page, - add_partial(n, page, DEACTIVATE_TO_TAIL); - stat(s, FREE_ADD_PARTIAL); - } -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - return; - - slab_empty: -@@ -3067,7 +3067,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page, - remove_full(s, n, page); - } - -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - stat(s, FREE_SLAB); - discard_slab(s, page); - } -@@ -3472,7 +3472,7 @@ static void - init_kmem_cache_node(struct kmem_cache_node *n) - { - n->nr_partial = 0; -- spin_lock_init(&n->list_lock); -+ raw_spin_lock_init(&n->list_lock); - INIT_LIST_HEAD(&n->partial); - #ifdef CONFIG_SLUB_DEBUG - atomic_long_set(&n->nr_slabs, 0); -@@ -3873,7 +3873,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n) - struct page *page, *h; - - BUG_ON(irqs_disabled()); -- spin_lock_irq(&n->list_lock); -+ raw_spin_lock_irq(&n->list_lock); - list_for_each_entry_safe(page, h, &n->partial, slab_list) { - if (!page->inuse) { - remove_partial(n, page); -@@ -3883,7 +3883,7 @@ static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n) - "Objects remaining in %s on __kmem_cache_shutdown()"); - } - } -- spin_unlock_irq(&n->list_lock); -+ raw_spin_unlock_irq(&n->list_lock); - - list_for_each_entry_safe(page, h, &discard, slab_list) - discard_slab(s, page); -@@ -4154,7 +4154,7 @@ int __kmem_cache_shrink(struct kmem_cache *s) - for (i = 0; i < SHRINK_PROMOTE_MAX; i++) - INIT_LIST_HEAD(promote + i); - -- spin_lock_irqsave(&n->list_lock, flags); -+ raw_spin_lock_irqsave(&n->list_lock, flags); - - /* - * Build lists of slabs to discard or promote. 
-@@ -4185,7 +4185,7 @@ int __kmem_cache_shrink(struct kmem_cache *s) - for (i = SHRINK_PROMOTE_MAX - 1; i >= 0; i--) - list_splice(promote + i, &n->partial); - -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - - /* Release empty slabs */ - list_for_each_entry_safe(page, t, &discard, slab_list) -@@ -4547,7 +4547,7 @@ static int validate_slab_node(struct kmem_cache *s, - struct page *page; - unsigned long flags; - -- spin_lock_irqsave(&n->list_lock, flags); -+ raw_spin_lock_irqsave(&n->list_lock, flags); - - list_for_each_entry(page, &n->partial, slab_list) { - validate_slab(s, page); -@@ -4569,7 +4569,7 @@ static int validate_slab_node(struct kmem_cache *s, - s->name, count, atomic_long_read(&n->nr_slabs)); - - out: -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - return count; - } - -@@ -4748,12 +4748,12 @@ static int list_locations(struct kmem_cache *s, char *buf, - if (!atomic_long_read(&n->nr_slabs)) - continue; - -- spin_lock_irqsave(&n->list_lock, flags); -+ raw_spin_lock_irqsave(&n->list_lock, flags); - list_for_each_entry(page, &n->partial, slab_list) - process_slab(&t, s, page, alloc); - list_for_each_entry(page, &n->full, slab_list) - process_slab(&t, s, page, alloc); -- spin_unlock_irqrestore(&n->list_lock, flags); -+ raw_spin_unlock_irqrestore(&n->list_lock, flags); - } - - for (i = 0; i < t.count; i++) { --- -2.30.2 - diff --git a/debian/patches-rt/0202-mm-SLUB-delay-giving-back-empty-slubs-to-IRQ-enabled.patch b/debian/patches-rt/0202-mm-SLUB-delay-giving-back-empty-slubs-to-IRQ-enabled.patch deleted file mode 100644 index 6d2f3f3af..000000000 --- a/debian/patches-rt/0202-mm-SLUB-delay-giving-back-empty-slubs-to-IRQ-enabled.patch +++ /dev/null @@ -1,223 +0,0 @@ -From 02c7633c3140289c8082d34aed4bfbe746b323cb Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 21 Jun 2018 17:29:19 +0200 -Subject: [PATCH 202/296] 
mm/SLUB: delay giving back empty slubs to IRQ enabled - regions -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -__free_slab() is invoked with disabled interrupts which increases the -irq-off time while __free_pages() is doing the work. -Allow __free_slab() to be invoked with enabled interrupts and move -everything from interrupts-off invocations to a temporary per-CPU list -so it can be processed later. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/slub.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++---- - 1 file changed, 69 insertions(+), 5 deletions(-) - -diff --git a/mm/slub.c b/mm/slub.c -index ff543c29c3c7..8b1abf3277ec 100644 ---- a/mm/slub.c -+++ b/mm/slub.c -@@ -1496,6 +1496,12 @@ static bool freelist_corrupted(struct kmem_cache *s, struct page *page, - } - #endif /* CONFIG_SLUB_DEBUG */ - -+struct slub_free_list { -+ raw_spinlock_t lock; -+ struct list_head list; -+}; -+static DEFINE_PER_CPU(struct slub_free_list, slub_free_list); -+ - /* - * Hooks for other subsystems that check memory allocations. In a typical - * production configuration these hooks all should produce no code at all. 
-@@ -1844,6 +1850,16 @@ static void __free_slab(struct kmem_cache *s, struct page *page) - __free_pages(page, order); - } - -+static void free_delayed(struct list_head *h) -+{ -+ while (!list_empty(h)) { -+ struct page *page = list_first_entry(h, struct page, lru); -+ -+ list_del(&page->lru); -+ __free_slab(page->slab_cache, page); -+ } -+} -+ - static void rcu_free_slab(struct rcu_head *h) - { - struct page *page = container_of(h, struct page, rcu_head); -@@ -1855,6 +1871,12 @@ static void free_slab(struct kmem_cache *s, struct page *page) - { - if (unlikely(s->flags & SLAB_TYPESAFE_BY_RCU)) { - call_rcu(&page->rcu_head, rcu_free_slab); -+ } else if (irqs_disabled()) { -+ struct slub_free_list *f = this_cpu_ptr(&slub_free_list); -+ -+ raw_spin_lock(&f->lock); -+ list_add(&page->lru, &f->list); -+ raw_spin_unlock(&f->lock); - } else - __free_slab(s, page); - } -@@ -2386,14 +2408,21 @@ static void put_cpu_partial(struct kmem_cache *s, struct page *page, int drain) - pobjects = oldpage->pobjects; - pages = oldpage->pages; - if (drain && pobjects > slub_cpu_partial(s)) { -+ struct slub_free_list *f; - unsigned long flags; -+ LIST_HEAD(tofree); - /* - * partial array is full. Move the existing - * set to the per node partial list. 
- */ - local_irq_save(flags); - unfreeze_partials(s, this_cpu_ptr(s->cpu_slab)); -+ f = this_cpu_ptr(&slub_free_list); -+ raw_spin_lock(&f->lock); -+ list_splice_init(&f->list, &tofree); -+ raw_spin_unlock(&f->lock); - local_irq_restore(flags); -+ free_delayed(&tofree); - oldpage = NULL; - pobjects = 0; - pages = 0; -@@ -2461,7 +2490,22 @@ static bool has_cpu_slab(int cpu, void *info) - - static void flush_all(struct kmem_cache *s) - { -+ LIST_HEAD(tofree); -+ int cpu; -+ - on_each_cpu_cond(has_cpu_slab, flush_cpu_slab, s, 1); -+ for_each_online_cpu(cpu) { -+ struct slub_free_list *f; -+ -+ if (!has_cpu_slab(cpu, s)) -+ continue; -+ -+ f = &per_cpu(slub_free_list, cpu); -+ raw_spin_lock_irq(&f->lock); -+ list_splice_init(&f->list, &tofree); -+ raw_spin_unlock_irq(&f->lock); -+ free_delayed(&tofree); -+ } - } - - /* -@@ -2658,8 +2702,10 @@ static inline void *get_freelist(struct kmem_cache *s, struct page *page) - * already disabled (which is the case for bulk allocation). - */ - static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, -- unsigned long addr, struct kmem_cache_cpu *c) -+ unsigned long addr, struct kmem_cache_cpu *c, -+ struct list_head *to_free) - { -+ struct slub_free_list *f; - void *freelist; - struct page *page; - -@@ -2727,6 +2773,13 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, - VM_BUG_ON(!c->page->frozen); - c->freelist = get_freepointer(s, freelist); - c->tid = next_tid(c->tid); -+ -+out: -+ f = this_cpu_ptr(&slub_free_list); -+ raw_spin_lock(&f->lock); -+ list_splice_init(&f->list, to_free); -+ raw_spin_unlock(&f->lock); -+ - return freelist; - - new_slab: -@@ -2742,7 +2795,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, - - if (unlikely(!freelist)) { - slab_out_of_memory(s, gfpflags, node); -- return NULL; -+ goto out; - } - - page = c->page; -@@ -2755,7 +2808,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, - goto new_slab; /* 
Slab failed checks. Next slab needed */ - - deactivate_slab(s, page, get_freepointer(s, freelist), c); -- return freelist; -+ goto out; - } - - /* -@@ -2767,6 +2820,7 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, - { - void *p; - unsigned long flags; -+ LIST_HEAD(tofree); - - local_irq_save(flags); - #ifdef CONFIG_PREEMPTION -@@ -2778,8 +2832,9 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node, - c = this_cpu_ptr(s->cpu_slab); - #endif - -- p = ___slab_alloc(s, gfpflags, node, addr, c); -+ p = ___slab_alloc(s, gfpflags, node, addr, c, &tofree); - local_irq_restore(flags); -+ free_delayed(&tofree); - return p; - } - -@@ -3275,6 +3330,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, - void **p) - { - struct kmem_cache_cpu *c; -+ LIST_HEAD(to_free); - int i; - struct obj_cgroup *objcg = NULL; - -@@ -3308,7 +3364,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, - * of re-populating per CPU c->freelist - */ - p[i] = ___slab_alloc(s, flags, NUMA_NO_NODE, -- _RET_IP_, c); -+ _RET_IP_, c, &to_free); - if (unlikely(!p[i])) - goto error; - -@@ -3323,6 +3379,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, - } - c->tid = next_tid(c->tid); - local_irq_enable(); -+ free_delayed(&to_free); - - /* Clear memory outside IRQ disabled fastpath loop */ - if (unlikely(slab_want_init_on_alloc(flags, s))) { -@@ -3337,6 +3394,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, - return i; - error: - local_irq_enable(); -+ free_delayed(&to_free); - slab_post_alloc_hook(s, objcg, flags, i, p); - __kmem_cache_free_bulk(s, i, p); - return 0; -@@ -4360,6 +4418,12 @@ void __init kmem_cache_init(void) - { - static __initdata struct kmem_cache boot_kmem_cache, - boot_kmem_cache_node; -+ int cpu; -+ -+ for_each_possible_cpu(cpu) { -+ raw_spin_lock_init(&per_cpu(slub_free_list, cpu).lock); -+ 
INIT_LIST_HEAD(&per_cpu(slub_free_list, cpu).list); -+ } - - if (debug_guardpage_minorder()) - slub_max_order = 0; --- -2.30.2 - diff --git a/debian/patches-rt/0203-mm-slub-Always-flush-the-delayed-empty-slubs-in-flus.patch b/debian/patches-rt/0203-mm-slub-Always-flush-the-delayed-empty-slubs-in-flus.patch deleted file mode 100644 index 9182905e5..000000000 --- a/debian/patches-rt/0203-mm-slub-Always-flush-the-delayed-empty-slubs-in-flus.patch +++ /dev/null @@ -1,61 +0,0 @@ -From c32cfea8d70c798c232db8cbb3f79bb5b611375c Mon Sep 17 00:00:00 2001 -From: Kevin Hao <haokexin@gmail.com> -Date: Mon, 4 May 2020 11:34:07 +0800 -Subject: [PATCH 203/296] mm: slub: Always flush the delayed empty slubs in - flush_all() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -After commit f0b231101c94 ("mm/SLUB: delay giving back empty slubs to -IRQ enabled regions"), when free_slab() is invoked with IRQs -disabled, the empty slubs are moved to a per-CPU list and are -freed later, after IRQs are re-enabled. 
But in the current code, there is -a check to see whether there really is a cpu slab on a specific cpu -before flushing the delayed empty slubs, which may cause a reference -to an already released kmem_cache in a scenario like below: - cpu 0 cpu 1 - kmem_cache_destroy() - flush_all() - --->IPI flush_cpu_slab() - flush_slab() - deactivate_slab() - discard_slab() - free_slab() - c->page = NULL; - for_each_online_cpu(cpu) - if (!has_cpu_slab(1, s)) - continue - this skips flushing the delayed - empty slub released by cpu1 - kmem_cache_free(kmem_cache, s) - - kmalloc() - __slab_alloc() - free_delayed() - __free_slab() - reference to released kmem_cache - -Fixes: f0b231101c94 ("mm/SLUB: delay giving back empty slubs to IRQ enabled regions") -Signed-off-by: Kevin Hao <haokexin@gmail.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Cc: stable-rt@vger.kernel.org ---- - mm/slub.c | 3 --- - 1 file changed, 3 deletions(-) - -diff --git a/mm/slub.c b/mm/slub.c -index 8b1abf3277ec..e782944d5799 100644 ---- a/mm/slub.c -+++ b/mm/slub.c -@@ -2497,9 +2497,6 @@ static void flush_all(struct kmem_cache *s) - for_each_online_cpu(cpu) { - struct slub_free_list *f; - -- if (!has_cpu_slab(cpu, s)) -- continue; -- - f = &per_cpu(slub_free_list, cpu); - raw_spin_lock_irq(&f->lock); - list_splice_init(&f->list, &tofree); --- -2.30.2 - diff --git a/debian/patches-rt/0204-mm-slub-Don-t-resize-the-location-tracking-cache-on-.patch b/debian/patches-rt/0204-mm-slub-Don-t-resize-the-location-tracking-cache-on-.patch deleted file mode 100644 index 2dac78379..000000000 --- a/debian/patches-rt/0204-mm-slub-Don-t-resize-the-location-tracking-cache-on-.patch +++ /dev/null @@ -1,37 +0,0 @@ -From bfba19913525a96ee88cb152d32eab7cbd0eaf77 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 26 Feb 2021 17:26:04 +0100 -Subject: [PATCH 204/296] mm: slub: Don't resize the location tracking cache on - PREEMPT_RT -Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The location tracking cache has a size of a page and is resized if its -current size is too small. -This allocation happens with disabled interrupts and can't happen on -PREEMPT_RT. -Should one page be too small, then we have to allocate more at the -beginning. The only downside is that fewer callers will be visible. - -Cc: stable-rt@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/slub.c | 3 +++ - 1 file changed, 3 insertions(+) - -diff --git a/mm/slub.c b/mm/slub.c -index e782944d5799..f5802141ba71 100644 ---- a/mm/slub.c -+++ b/mm/slub.c -@@ -4681,6 +4681,9 @@ static int alloc_loc_track(struct loc_track *t, unsigned long max, gfp_t flags) - struct location *l; - int order; - -+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && flags == GFP_ATOMIC) -+ return 0; -+ - order = get_order(sizeof(struct location) * max); - - l = (void *)__get_free_pages(flags, order); --- -2.30.2 - diff --git a/debian/patches-rt/0205-mm-page_alloc-Use-migrate_disable-in-drain_local_pag.patch b/debian/patches-rt/0205-mm-page_alloc-Use-migrate_disable-in-drain_local_pag.patch deleted file mode 100644 index 52b8af800..000000000 --- a/debian/patches-rt/0205-mm-page_alloc-Use-migrate_disable-in-drain_local_pag.patch +++ /dev/null @@ -1,39 +0,0 @@ -From 64b1879c3821837c907fb42366731b7d7aad78b5 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 2 Jul 2020 14:27:23 +0200 -Subject: [PATCH 205/296] mm/page_alloc: Use migrate_disable() in - drain_local_pages_wq() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -drain_local_pages_wq() disables preemption to avoid CPU migration during -CPU hotplug. -Using migrate_disable() makes the function preemptible on PREEMPT_RT but -still avoids CPU migrations during CPU-hotplug. On !PREEMPT_RT it -behaves like preempt_disable(). 
- -Use migrate_disable() in drain_local_pages_wq(). - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/page_alloc.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/mm/page_alloc.c b/mm/page_alloc.c -index 84ee66afc232..b8350cbe66f9 100644 ---- a/mm/page_alloc.c -+++ b/mm/page_alloc.c -@@ -3048,9 +3048,9 @@ static void drain_local_pages_wq(struct work_struct *work) - * cpu which is allright but we also have to make sure to not move to - * a different one. - */ -- preempt_disable(); -+ migrate_disable(); - drain_local_pages(drain->zone); -- preempt_enable(); -+ migrate_enable(); - } - - /* --- -2.30.2 - diff --git a/debian/patches-rt/0206-mm-page_alloc-rt-friendly-per-cpu-pages.patch b/debian/patches-rt/0206-mm-page_alloc-rt-friendly-per-cpu-pages.patch deleted file mode 100644 index b45e536ba..000000000 --- a/debian/patches-rt/0206-mm-page_alloc-rt-friendly-per-cpu-pages.patch +++ /dev/null @@ -1,197 +0,0 @@ -From 3e154f65955f6e6664afe437446c1cf8c1409f5d Mon Sep 17 00:00:00 2001 -From: Ingo Molnar <mingo@elte.hu> -Date: Fri, 3 Jul 2009 08:29:37 -0500 -Subject: [PATCH 206/296] mm: page_alloc: rt-friendly per-cpu pages -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -rt-friendly per-cpu pages: convert the irqs-off per-cpu locking -method into a preemptible, explicit-per-cpu-locks method. 
- -Contains fixes from: - Peter Zijlstra <a.p.zijlstra@chello.nl> - Thomas Gleixner <tglx@linutronix.de> - -Signed-off-by: Ingo Molnar <mingo@elte.hu> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - mm/page_alloc.c | 47 ++++++++++++++++++++++++++++------------------- - 1 file changed, 28 insertions(+), 19 deletions(-) - -diff --git a/mm/page_alloc.c b/mm/page_alloc.c -index b8350cbe66f9..287f3afc3cf1 100644 ---- a/mm/page_alloc.c -+++ b/mm/page_alloc.c -@@ -61,6 +61,7 @@ - #include <linux/hugetlb.h> - #include <linux/sched/rt.h> - #include <linux/sched/mm.h> -+#include <linux/local_lock.h> - #include <linux/page_owner.h> - #include <linux/kthread.h> - #include <linux/memcontrol.h> -@@ -386,6 +387,13 @@ EXPORT_SYMBOL(nr_node_ids); - EXPORT_SYMBOL(nr_online_nodes); - #endif - -+struct pa_lock { -+ local_lock_t l; -+}; -+static DEFINE_PER_CPU(struct pa_lock, pa_lock) = { -+ .l = INIT_LOCAL_LOCK(l), -+}; -+ - int page_group_by_mobility_disabled __read_mostly; - - #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT -@@ -1543,11 +1551,11 @@ static void __free_pages_ok(struct page *page, unsigned int order, - return; - - migratetype = get_pfnblock_migratetype(page, pfn); -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - __count_vm_events(PGFREE, 1 << order); - free_one_page(page_zone(page), page, pfn, order, migratetype, - fpi_flags); -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - } - - void __free_pages_core(struct page *page, unsigned int order) -@@ -2961,13 +2969,13 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp) - int to_drain, batch; - LIST_HEAD(dst); - -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - batch = READ_ONCE(pcp->batch); - to_drain = min(pcp->count, batch); - if (to_drain > 0) - isolate_pcp_pages(to_drain, pcp, &dst); - -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - - if (to_drain > 0) - free_pcppages_bulk(zone, &dst, false); -@@ 
-2989,7 +2997,7 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone) - LIST_HEAD(dst); - int count; - -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - pset = per_cpu_ptr(zone->pageset, cpu); - - pcp = &pset->pcp; -@@ -2997,7 +3005,7 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone) - if (count) - isolate_pcp_pages(count, pcp, &dst); - -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - - if (count) - free_pcppages_bulk(zone, &dst, false); -@@ -3248,9 +3256,9 @@ void free_unref_page(struct page *page) - if (!free_unref_page_prepare(page, pfn)) - return; - -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - free_unref_page_commit(page, pfn, &dst); -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - if (!list_empty(&dst)) - free_pcppages_bulk(zone, &dst, false); - } -@@ -3277,7 +3285,7 @@ void free_unref_page_list(struct list_head *list) - set_page_private(page, pfn); - } - -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - list_for_each_entry_safe(page, next, list, lru) { - unsigned long pfn = page_private(page); - enum zone_type type; -@@ -3292,12 +3300,12 @@ void free_unref_page_list(struct list_head *list) - * a large list of pages to free. 
- */ - if (++batch_count == SWAP_CLUSTER_MAX) { -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - batch_count = 0; -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - } - } -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - - for (i = 0; i < __MAX_NR_ZONES; ) { - struct page *page; -@@ -3468,7 +3476,7 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone, - struct page *page; - unsigned long flags; - -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - pcp = &this_cpu_ptr(zone->pageset)->pcp; - list = &pcp->lists[migratetype]; - page = __rmqueue_pcplist(zone, migratetype, alloc_flags, pcp, list); -@@ -3476,7 +3484,7 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone, - __count_zid_vm_events(PGALLOC, page_zonenum(page), 1); - zone_statistics(preferred_zone, zone); - } -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - return page; - } - -@@ -3510,7 +3518,8 @@ struct page *rmqueue(struct zone *preferred_zone, - * allocate greater than order-1 page units with __GFP_NOFAIL. 
- */ - WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1)); -- spin_lock_irqsave(&zone->lock, flags); -+ local_lock_irqsave(&pa_lock.l, flags); -+ spin_lock(&zone->lock); - - do { - page = NULL; -@@ -3536,7 +3545,7 @@ struct page *rmqueue(struct zone *preferred_zone, - - __count_zid_vm_events(PGALLOC, page_zonenum(page), 1 << order); - zone_statistics(preferred_zone, zone); -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - - out: - /* Separate test+clear to avoid unnecessary atomics */ -@@ -3549,7 +3558,7 @@ struct page *rmqueue(struct zone *preferred_zone, - return page; - - failed: -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - return NULL; - } - -@@ -8794,7 +8803,7 @@ void zone_pcp_reset(struct zone *zone) - struct per_cpu_pageset *pset; - - /* avoid races with drain_pages() */ -- local_irq_save(flags); -+ local_lock_irqsave(&pa_lock.l, flags); - if (zone->pageset != &boot_pageset) { - for_each_online_cpu(cpu) { - pset = per_cpu_ptr(zone->pageset, cpu); -@@ -8803,7 +8812,7 @@ void zone_pcp_reset(struct zone *zone) - free_percpu(zone->pageset); - zone->pageset = &boot_pageset; - } -- local_irq_restore(flags); -+ local_unlock_irqrestore(&pa_lock.l, flags); - } - - #ifdef CONFIG_MEMORY_HOTREMOVE --- -2.30.2 - diff --git a/debian/patches-rt/0207-mm-slub-Make-object_map_lock-a-raw_spinlock_t.patch b/debian/patches-rt/0207-mm-slub-Make-object_map_lock-a-raw_spinlock_t.patch deleted file mode 100644 index 6f5b127e0..000000000 --- a/debian/patches-rt/0207-mm-slub-Make-object_map_lock-a-raw_spinlock_t.patch +++ /dev/null @@ -1,50 +0,0 @@ -From a2586a293712d9de53b87b7eb4069b9c1d49a6f8 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 16 Jul 2020 18:47:50 +0200 -Subject: [PATCH 207/296] mm/slub: Make object_map_lock a raw_spinlock_t -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The variable object_map is 
protected by object_map_lock. The lock is always -acquired in debug code and within already atomic context - -Make object_map_lock a raw_spinlock_t. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/slub.c | 6 +++--- - 1 file changed, 3 insertions(+), 3 deletions(-) - -diff --git a/mm/slub.c b/mm/slub.c -index f5802141ba71..65502b516d34 100644 ---- a/mm/slub.c -+++ b/mm/slub.c -@@ -434,7 +434,7 @@ static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page, - - #ifdef CONFIG_SLUB_DEBUG - static unsigned long object_map[BITS_TO_LONGS(MAX_OBJS_PER_PAGE)]; --static DEFINE_SPINLOCK(object_map_lock); -+static DEFINE_RAW_SPINLOCK(object_map_lock); - - /* - * Determine a map of object in use on a page. -@@ -450,7 +450,7 @@ static unsigned long *get_map(struct kmem_cache *s, struct page *page) - - VM_BUG_ON(!irqs_disabled()); - -- spin_lock(&object_map_lock); -+ raw_spin_lock(&object_map_lock); - - bitmap_zero(object_map, page->objects); - -@@ -463,7 +463,7 @@ static unsigned long *get_map(struct kmem_cache *s, struct page *page) - static void put_map(unsigned long *map) __releases(&object_map_lock) - { - VM_BUG_ON(map != object_map); -- spin_unlock(&object_map_lock); -+ raw_spin_unlock(&object_map_lock); - } - - static inline unsigned int size_from_object(struct kmem_cache *s) --- -2.30.2 - diff --git a/debian/patches-rt/0208-slub-Enable-irqs-for-__GFP_WAIT.patch b/debian/patches-rt/0208-slub-Enable-irqs-for-__GFP_WAIT.patch deleted file mode 100644 index efb63d38b..000000000 --- a/debian/patches-rt/0208-slub-Enable-irqs-for-__GFP_WAIT.patch +++ /dev/null @@ -1,76 +0,0 @@ -From 228117cf30801053bf5cb31324a93dfff662fa51 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Wed, 9 Jan 2013 12:08:15 +0100 -Subject: [PATCH 208/296] slub: Enable irqs for __GFP_WAIT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -SYSTEM_RUNNING might be too late for 
enabling interrupts. Allocations -with GFP_WAIT can happen before that. So use this as an indicator. - -[bigeasy: Add warning on RT for allocations in atomic context. - Don't enable interrupts on allocations during SYSTEM_SUSPEND. This is done - during suspend by ACPI, noticed by Liwei Song <liwei.song@windriver.com> -] - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - mm/slub.c | 18 +++++++++++++++++- - 1 file changed, 17 insertions(+), 1 deletion(-) - -diff --git a/mm/slub.c b/mm/slub.c -index 65502b516d34..b6f4b45f3849 100644 ---- a/mm/slub.c -+++ b/mm/slub.c -@@ -1745,10 +1745,18 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) - void *start, *p, *next; - int idx; - bool shuffle; -+ bool enableirqs = false; - - flags &= gfp_allowed_mask; - - if (gfpflags_allow_blocking(flags)) -+ enableirqs = true; -+ -+#ifdef CONFIG_PREEMPT_RT -+ if (system_state > SYSTEM_BOOTING && system_state < SYSTEM_SUSPEND) -+ enableirqs = true; -+#endif -+ if (enableirqs) - local_irq_enable(); - - flags |= s->allocflags; -@@ -1807,7 +1815,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node) - page->frozen = 1; - - out: -- if (gfpflags_allow_blocking(flags)) -+ if (enableirqs) - local_irq_disable(); - if (!page) - return NULL; -@@ -2865,6 +2873,10 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s, - unsigned long tid; - struct obj_cgroup *objcg = NULL; - -+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP)) -+ WARN_ON_ONCE(!preemptible() && -+ (system_state > SYSTEM_BOOTING && system_state < SYSTEM_SUSPEND)); -+ - s = slab_pre_alloc_hook(s, &objcg, 1, gfpflags); - if (!s) - return NULL; -@@ -3331,6 +3343,10 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size, - int i; - struct obj_cgroup *objcg = NULL; - -+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP)) -+ WARN_ON_ONCE(!preemptible() && -+ (system_state > 
SYSTEM_BOOTING && system_state < SYSTEM_SUSPEND)); -+ - /* memcg and kmem_cache debug support */ - s = slab_pre_alloc_hook(s, &objcg, size, flags); - if (unlikely(!s)) --- -2.30.2 - diff --git a/debian/patches-rt/0209-slub-Disable-SLUB_CPU_PARTIAL.patch b/debian/patches-rt/0209-slub-Disable-SLUB_CPU_PARTIAL.patch deleted file mode 100644 index 35f3971e6..000000000 --- a/debian/patches-rt/0209-slub-Disable-SLUB_CPU_PARTIAL.patch +++ /dev/null @@ -1,54 +0,0 @@ -From 8e7b33e56715ce9d23ac5dddeb5b9389d51201e9 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 15 Apr 2015 19:00:47 +0200 -Subject: [PATCH 209/296] slub: Disable SLUB_CPU_PARTIAL -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:915 -|in_atomic(): 1, irqs_disabled(): 0, pid: 87, name: rcuop/7 -|1 lock held by rcuop/7/87: -| #0: (rcu_callback){......}, at: [<ffffffff8112c76a>] rcu_nocb_kthread+0x1ca/0x5d0 -|Preemption disabled at:[<ffffffff811eebd9>] put_cpu_partial+0x29/0x220 -| -|CPU: 0 PID: 87 Comm: rcuop/7 Tainted: G W 4.0.0-rt0+ #477 -|Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 -| 000000000007a9fc ffff88013987baf8 ffffffff817441c7 0000000000000007 -| 0000000000000000 ffff88013987bb18 ffffffff810eee51 0000000000000000 -| ffff88013fc10200 ffff88013987bb48 ffffffff8174a1c4 000000000007a9fc -|Call Trace: -| [<ffffffff817441c7>] dump_stack+0x4f/0x90 -| [<ffffffff810eee51>] ___might_sleep+0x121/0x1b0 -| [<ffffffff8174a1c4>] rt_spin_lock+0x24/0x60 -| [<ffffffff811a689a>] __free_pages_ok+0xaa/0x540 -| [<ffffffff811a729d>] __free_pages+0x1d/0x30 -| [<ffffffff811eddd5>] __free_slab+0xc5/0x1e0 -| [<ffffffff811edf46>] free_delayed+0x56/0x70 -| [<ffffffff811eecfd>] put_cpu_partial+0x14d/0x220 -| [<ffffffff811efc98>] __slab_free+0x158/0x2c0 -| [<ffffffff811f0021>] kmem_cache_free+0x221/0x2d0 
-| [<ffffffff81204d0c>] file_free_rcu+0x2c/0x40 -| [<ffffffff8112c7e3>] rcu_nocb_kthread+0x243/0x5d0 -| [<ffffffff810e951c>] kthread+0xfc/0x120 -| [<ffffffff8174abc8>] ret_from_fork+0x58/0x90 - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - init/Kconfig | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/init/Kconfig b/init/Kconfig -index 3fe140f4f0ed..7ba2b602b707 100644 ---- a/init/Kconfig -+++ b/init/Kconfig -@@ -1984,7 +1984,7 @@ config SHUFFLE_PAGE_ALLOCATOR - - config SLUB_CPU_PARTIAL - default y -- depends on SLUB && SMP -+ depends on SLUB && SMP && !PREEMPT_RT - bool "SLUB per cpu partial cache" - help - Per cpu partial caches accelerate objects allocation and freeing --- -2.30.2 - diff --git a/debian/patches-rt/0210-mm-memcontrol-Provide-a-local_lock-for-per-CPU-memcg.patch b/debian/patches-rt/0210-mm-memcontrol-Provide-a-local_lock-for-per-CPU-memcg.patch deleted file mode 100644 index fb4f39586..000000000 --- a/debian/patches-rt/0210-mm-memcontrol-Provide-a-local_lock-for-per-CPU-memcg.patch +++ /dev/null @@ -1,144 +0,0 @@ -From a9f78a61c952ad464ad5362d98068b5d33c994f6 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 18 Aug 2020 10:30:00 +0200 -Subject: [PATCH 210/296] mm: memcontrol: Provide a local_lock for per-CPU - memcg_stock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The interrupts are disabled to ensure CPU-local access to the per-CPU -variable `memcg_stock'. -As the code inside the interrupt disabled section acquires regular -spinlocks, which are converted to 'sleeping' spinlocks on a PREEMPT_RT -kernel, this conflicts with the RT semantics. - -Convert it to a local_lock which allows RT kernels to substitute them with -a real per CPU lock. On non RT kernels this maps to local_irq_save() as -before, but provides also lockdep coverage of the critical region. -No functional change. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/memcontrol.c | 31 ++++++++++++++++++------------- - 1 file changed, 18 insertions(+), 13 deletions(-) - -diff --git a/mm/memcontrol.c b/mm/memcontrol.c -index bc99242d3415..dee3b8de569e 100644 ---- a/mm/memcontrol.c -+++ b/mm/memcontrol.c -@@ -2202,6 +2202,7 @@ void unlock_page_memcg(struct page *page) - EXPORT_SYMBOL(unlock_page_memcg); - - struct memcg_stock_pcp { -+ local_lock_t lock; - struct mem_cgroup *cached; /* this never be root cgroup */ - unsigned int nr_pages; - -@@ -2253,7 +2254,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages) - if (nr_pages > MEMCG_CHARGE_BATCH) - return ret; - -- local_irq_save(flags); -+ local_lock_irqsave(&memcg_stock.lock, flags); - - stock = this_cpu_ptr(&memcg_stock); - if (memcg == stock->cached && stock->nr_pages >= nr_pages) { -@@ -2261,7 +2262,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages) - ret = true; - } - -- local_irq_restore(flags); -+ local_unlock_irqrestore(&memcg_stock.lock, flags); - - return ret; - } -@@ -2296,14 +2297,14 @@ static void drain_local_stock(struct work_struct *dummy) - * The only protection from memory hotplug vs. 
drain_stock races is - * that we always operate on local CPU stock here with IRQ disabled - */ -- local_irq_save(flags); -+ local_lock_irqsave(&memcg_stock.lock, flags); - - stock = this_cpu_ptr(&memcg_stock); - drain_obj_stock(stock); - drain_stock(stock); - clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags); - -- local_irq_restore(flags); -+ local_unlock_irqrestore(&memcg_stock.lock, flags); - } - - /* -@@ -2315,7 +2316,7 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) - struct memcg_stock_pcp *stock; - unsigned long flags; - -- local_irq_save(flags); -+ local_lock_irqsave(&memcg_stock.lock, flags); - - stock = this_cpu_ptr(&memcg_stock); - if (stock->cached != memcg) { /* reset if necessary */ -@@ -2328,7 +2329,7 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) - if (stock->nr_pages > MEMCG_CHARGE_BATCH) - drain_stock(stock); - -- local_irq_restore(flags); -+ local_unlock_irqrestore(&memcg_stock.lock, flags); - } - - /* -@@ -3139,7 +3140,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) - unsigned long flags; - bool ret = false; - -- local_irq_save(flags); -+ local_lock_irqsave(&memcg_stock.lock, flags); - - stock = this_cpu_ptr(&memcg_stock); - if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) { -@@ -3147,7 +3148,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) - ret = true; - } - -- local_irq_restore(flags); -+ local_unlock_irqrestore(&memcg_stock.lock, flags); - - return ret; - } -@@ -3206,7 +3207,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) - struct memcg_stock_pcp *stock; - unsigned long flags; - -- local_irq_save(flags); -+ local_lock_irqsave(&memcg_stock.lock, flags); - - stock = this_cpu_ptr(&memcg_stock); - if (stock->cached_objcg != objcg) { /* reset if necessary */ -@@ -3220,7 +3221,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) - if 
(stock->nr_bytes > PAGE_SIZE) - drain_obj_stock(stock); - -- local_irq_restore(flags); -+ local_unlock_irqrestore(&memcg_stock.lock, flags); - } - - int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size) -@@ -7140,9 +7141,13 @@ static int __init mem_cgroup_init(void) - cpuhp_setup_state_nocalls(CPUHP_MM_MEMCQ_DEAD, "mm/memctrl:dead", NULL, - memcg_hotplug_cpu_dead); - -- for_each_possible_cpu(cpu) -- INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work, -- drain_local_stock); -+ for_each_possible_cpu(cpu) { -+ struct memcg_stock_pcp *stock; -+ -+ stock = per_cpu_ptr(&memcg_stock, cpu); -+ INIT_WORK(&stock->work, drain_local_stock); -+ local_lock_init(&stock->lock); -+ } - - for_each_node(node) { - struct mem_cgroup_tree_per_node *rtpn; --- -2.30.2 - diff --git a/debian/patches-rt/0211-mm-memcontrol-Don-t-call-schedule_work_on-in-preempt.patch b/debian/patches-rt/0211-mm-memcontrol-Don-t-call-schedule_work_on-in-preempt.patch deleted file mode 100644 index f0e70f5ea..000000000 --- a/debian/patches-rt/0211-mm-memcontrol-Don-t-call-schedule_work_on-in-preempt.patch +++ /dev/null @@ -1,75 +0,0 @@ -From 36bdf1cba92d8dbe136a4d5bc011a15450d11434 Mon Sep 17 00:00:00 2001 -From: Yang Shi <yang.shi@windriver.com> -Date: Wed, 30 Oct 2013 11:48:33 -0700 -Subject: [PATCH 211/296] mm/memcontrol: Don't call schedule_work_on in - preemption disabled context -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The following trace is triggered when running ltp oom test cases: - -BUG: sleeping function called from invalid context at kernel/rtmutex.c:659 -in_atomic(): 1, irqs_disabled(): 0, pid: 17188, name: oom03 -Preemption disabled at:[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0 - -CPU: 2 PID: 17188 Comm: oom03 Not tainted 3.10.10-rt3 #2 -Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010 -ffff88007684d730 ffff880070df9b58 ffffffff8169918d ffff880070df9b70 -ffffffff8106db31 
ffff88007688b4a0 ffff880070df9b88 ffffffff8169d9c0 -ffff88007688b4a0 ffff880070df9bc8 ffffffff81059da1 0000000170df9bb0 -Call Trace: -[<ffffffff8169918d>] dump_stack+0x19/0x1b -[<ffffffff8106db31>] __might_sleep+0xf1/0x170 -[<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50 -[<ffffffff81059da1>] queue_work_on+0x61/0x100 -[<ffffffff8112b361>] drain_all_stock+0xe1/0x1c0 -[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0 -[<ffffffff8112beda>] __mem_cgroup_try_charge+0x41a/0xc40 -[<ffffffff810f1c91>] ? release_pages+0x1b1/0x1f0 -[<ffffffff8106f200>] ? sched_exec+0x40/0xb0 -[<ffffffff8112cc87>] mem_cgroup_charge_common+0x37/0x70 -[<ffffffff8112e2c6>] mem_cgroup_newpage_charge+0x26/0x30 -[<ffffffff8110af68>] handle_pte_fault+0x618/0x840 -[<ffffffff8103ecf6>] ? unpin_current_cpu+0x16/0x70 -[<ffffffff81070f94>] ? migrate_enable+0xd4/0x200 -[<ffffffff8110cde5>] handle_mm_fault+0x145/0x1e0 -[<ffffffff810301e1>] __do_page_fault+0x1a1/0x4c0 -[<ffffffff8169c9eb>] ? preempt_schedule_irq+0x4b/0x70 -[<ffffffff8169e3b7>] ? retint_kernel+0x37/0x40 -[<ffffffff8103053e>] do_page_fault+0xe/0x10 -[<ffffffff8169e4c2>] page_fault+0x22/0x30 - -So, to prevent schedule_work_on from being called in preempt disabled context, -replace the pair of get/put_cpu() to get/put_cpu_light(). - - -Signed-off-by: Yang Shi <yang.shi@windriver.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/memcontrol.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/mm/memcontrol.c b/mm/memcontrol.c -index dee3b8de569e..1302658f6f6e 100644 ---- a/mm/memcontrol.c -+++ b/mm/memcontrol.c -@@ -2349,7 +2349,7 @@ static void drain_all_stock(struct mem_cgroup *root_memcg) - * as well as workers from this path always operate on the local - * per-cpu data. CPU up doesn't touch memcg_stock at all. 
- */ -- curcpu = get_cpu(); -+ curcpu = get_cpu_light(); - for_each_online_cpu(cpu) { - struct memcg_stock_pcp *stock = &per_cpu(memcg_stock, cpu); - struct mem_cgroup *memcg; -@@ -2372,7 +2372,7 @@ static void drain_all_stock(struct mem_cgroup *root_memcg) - schedule_work_on(cpu, &stock->work); - } - } -- put_cpu(); -+ put_cpu_light(); - mutex_unlock(&percpu_charge_mutex); - } - --- -2.30.2 - diff --git a/debian/patches-rt/0212-mm-memcontrol-Replace-local_irq_disable-with-local-l.patch b/debian/patches-rt/0212-mm-memcontrol-Replace-local_irq_disable-with-local-l.patch deleted file mode 100644 index 8deace558..000000000 --- a/debian/patches-rt/0212-mm-memcontrol-Replace-local_irq_disable-with-local-l.patch +++ /dev/null @@ -1,123 +0,0 @@ -From e1d7af76d5591808a459135d5d3838d3033a7cb5 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 28 Jan 2015 17:14:16 +0100 -Subject: [PATCH 212/296] mm/memcontrol: Replace local_irq_disable with local - locks -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -There are a few local_irq_disable() which then take sleeping locks. This -patch converts them local locks. 
- -[bigeasy: Move unlock after memcg_check_events() in mem_cgroup_swapout(), - pointed out by Matt Fleming <matt@codeblueprint.co.uk>] -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/memcontrol.c | 29 +++++++++++++++++++++-------- - 1 file changed, 21 insertions(+), 8 deletions(-) - -diff --git a/mm/memcontrol.c b/mm/memcontrol.c -index 1302658f6f6e..4f9cd45aaf50 100644 ---- a/mm/memcontrol.c -+++ b/mm/memcontrol.c -@@ -63,6 +63,7 @@ - #include <net/sock.h> - #include <net/ip.h> - #include "slab.h" -+#include <linux/local_lock.h> - - #include <linux/uaccess.h> - -@@ -93,6 +94,13 @@ bool cgroup_memory_noswap __read_mostly; - static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq); - #endif - -+struct event_lock { -+ local_lock_t l; -+}; -+static DEFINE_PER_CPU(struct event_lock, event_lock) = { -+ .l = INIT_LOCAL_LOCK(l), -+}; -+ - /* Whether legacy memory+swap accounting is active */ - static bool do_memsw_account(void) - { -@@ -5726,12 +5734,12 @@ static int mem_cgroup_move_account(struct page *page, - - ret = 0; - -- local_irq_disable(); -+ local_lock_irq(&event_lock.l); - mem_cgroup_charge_statistics(to, page, nr_pages); - memcg_check_events(to, page); - mem_cgroup_charge_statistics(from, page, -nr_pages); - memcg_check_events(from, page); -- local_irq_enable(); -+ local_unlock_irq(&event_lock.l); - out_unlock: - unlock_page(page); - out: -@@ -6801,10 +6809,10 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask) - css_get(&memcg->css); - commit_charge(page, memcg); - -- local_irq_disable(); -+ local_lock_irq(&event_lock.l); - mem_cgroup_charge_statistics(memcg, page, nr_pages); - memcg_check_events(memcg, page); -- local_irq_enable(); -+ local_unlock_irq(&event_lock.l); - - /* - * Cgroup1's unified memory+swap counter has been charged with the -@@ -6860,11 +6868,11 @@ static void uncharge_batch(const struct uncharge_gather *ug) - memcg_oom_recover(ug->memcg); - } - -- local_irq_save(flags); -+ 
local_lock_irqsave(&event_lock.l, flags); - __count_memcg_events(ug->memcg, PGPGOUT, ug->pgpgout); - __this_cpu_add(ug->memcg->vmstats_percpu->nr_page_events, ug->nr_pages); - memcg_check_events(ug->memcg, ug->dummy_page); -- local_irq_restore(flags); -+ local_unlock_irqrestore(&event_lock.l, flags); - - /* drop reference from uncharge_page */ - css_put(&ug->memcg->css); -@@ -7018,10 +7026,10 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage) - css_get(&memcg->css); - commit_charge(newpage, memcg); - -- local_irq_save(flags); -+ local_lock_irqsave(&event_lock.l, flags); - mem_cgroup_charge_statistics(memcg, newpage, nr_pages); - memcg_check_events(memcg, newpage); -- local_irq_restore(flags); -+ local_unlock_irqrestore(&event_lock.l, flags); - } - - DEFINE_STATIC_KEY_FALSE(memcg_sockets_enabled_key); -@@ -7196,6 +7204,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) - struct mem_cgroup *memcg, *swap_memcg; - unsigned int nr_entries; - unsigned short oldid; -+ unsigned long flags; - - VM_BUG_ON_PAGE(PageLRU(page), page); - VM_BUG_ON_PAGE(page_count(page), page); -@@ -7241,9 +7250,13 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry) - * important here to have the interrupts disabled because it is the - * only synchronisation we have for updating the per-CPU variables. 
- */ -+ local_lock_irqsave(&event_lock.l, flags); -+#ifndef CONFIG_PREEMPT_RT - VM_BUG_ON(!irqs_disabled()); -+#endif - mem_cgroup_charge_statistics(memcg, page, -nr_entries); - memcg_check_events(memcg, page); -+ local_unlock_irqrestore(&event_lock.l, flags); - - css_put(&memcg->css); - } --- -2.30.2 - diff --git a/debian/patches-rt/0214-mm-zswap-Use-local-lock-to-protect-per-CPU-data.patch b/debian/patches-rt/0214-mm-zswap-Use-local-lock-to-protect-per-CPU-data.patch deleted file mode 100644 index a72d683fe..000000000 --- a/debian/patches-rt/0214-mm-zswap-Use-local-lock-to-protect-per-CPU-data.patch +++ /dev/null @@ -1,150 +0,0 @@ -From f8e2f564824ac7900051b0b7c9f088aa8fa6f256 Mon Sep 17 00:00:00 2001 -From: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com> -Date: Tue, 25 Jun 2019 11:28:04 -0300 -Subject: [PATCH 214/296] mm/zswap: Use local lock to protect per-CPU data -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -zwap uses per-CPU compression. The per-CPU data pointer is acquired with -get_cpu_ptr() which implicitly disables preemption. It allocates -memory inside the preempt disabled region which conflicts with the -PREEMPT_RT semantics. - -Replace the implicit preemption control with an explicit local lock. -This allows RT kernels to substitute it with a real per CPU lock, which -serializes the access but keeps the code section preemptible. On non RT -kernels this maps to preempt_disable() as before, i.e. no functional -change. - -[bigeasy: Use local_lock(), additional hunks, patch description] - -Cc: Seth Jennings <sjenning@redhat.com> -Cc: Dan Streetman <ddstreet@ieee.org> -Cc: Vitaly Wool <vitaly.wool@konsulko.com> -Cc: Andrew Morton <akpm@linux-foundation.org> -Cc: linux-mm@kvack.org -Signed-off-by: Luis Claudio R. 
Goncalves <lgoncalv@redhat.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - mm/zswap.c | 43 ++++++++++++++++++++++++++++--------------- - 1 file changed, 28 insertions(+), 15 deletions(-) - -diff --git a/mm/zswap.c b/mm/zswap.c -index fbb782924ccc..b24f761b9241 100644 ---- a/mm/zswap.c -+++ b/mm/zswap.c -@@ -18,6 +18,7 @@ - #include <linux/highmem.h> - #include <linux/slab.h> - #include <linux/spinlock.h> -+#include <linux/local_lock.h> - #include <linux/types.h> - #include <linux/atomic.h> - #include <linux/frontswap.h> -@@ -387,27 +388,37 @@ static struct zswap_entry *zswap_entry_find_get(struct rb_root *root, - /********************************* - * per-cpu code - **********************************/ --static DEFINE_PER_CPU(u8 *, zswap_dstmem); -+struct zswap_comp { -+ /* Used for per-CPU dstmem and tfm */ -+ local_lock_t lock; -+ u8 *dstmem; -+}; -+ -+static DEFINE_PER_CPU(struct zswap_comp, zswap_comp) = { -+ .lock = INIT_LOCAL_LOCK(lock), -+}; - - static int zswap_dstmem_prepare(unsigned int cpu) - { -+ struct zswap_comp *zcomp; - u8 *dst; - - dst = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu)); - if (!dst) - return -ENOMEM; - -- per_cpu(zswap_dstmem, cpu) = dst; -+ zcomp = per_cpu_ptr(&zswap_comp, cpu); -+ zcomp->dstmem = dst; - return 0; - } - - static int zswap_dstmem_dead(unsigned int cpu) - { -- u8 *dst; -+ struct zswap_comp *zcomp; - -- dst = per_cpu(zswap_dstmem, cpu); -- kfree(dst); -- per_cpu(zswap_dstmem, cpu) = NULL; -+ zcomp = per_cpu_ptr(&zswap_comp, cpu); -+ kfree(zcomp->dstmem); -+ zcomp->dstmem = NULL; - - return 0; - } -@@ -919,10 +930,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle) - dlen = PAGE_SIZE; - src = (u8 *)zhdr + sizeof(struct zswap_header); - dst = kmap_atomic(page); -- tfm = *get_cpu_ptr(entry->pool->tfm); -+ local_lock(&zswap_comp.lock); -+ tfm = *this_cpu_ptr(entry->pool->tfm); - ret = crypto_comp_decompress(tfm, src, entry->length, - dst, &dlen); -- 
put_cpu_ptr(entry->pool->tfm); -+ local_unlock(&zswap_comp.lock); - kunmap_atomic(dst); - BUG_ON(ret); - BUG_ON(dlen != PAGE_SIZE); -@@ -1074,12 +1086,12 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset, - } - - /* compress */ -- dst = get_cpu_var(zswap_dstmem); -- tfm = *get_cpu_ptr(entry->pool->tfm); -+ local_lock(&zswap_comp.lock); -+ dst = *this_cpu_ptr(&zswap_comp.dstmem); -+ tfm = *this_cpu_ptr(entry->pool->tfm); - src = kmap_atomic(page); - ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen); - kunmap_atomic(src); -- put_cpu_ptr(entry->pool->tfm); - if (ret) { - ret = -EINVAL; - goto put_dstmem; -@@ -1103,7 +1115,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset, - memcpy(buf, &zhdr, hlen); - memcpy(buf + hlen, dst, dlen); - zpool_unmap_handle(entry->pool->zpool, handle); -- put_cpu_var(zswap_dstmem); -+ local_unlock(&zswap_comp.lock); - - /* populate entry */ - entry->offset = offset; -@@ -1131,7 +1143,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset, - return 0; - - put_dstmem: -- put_cpu_var(zswap_dstmem); -+ local_unlock(&zswap_comp.lock); - zswap_pool_put(entry->pool); - freepage: - zswap_entry_cache_free(entry); -@@ -1176,9 +1188,10 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset, - if (zpool_evictable(entry->pool->zpool)) - src += sizeof(struct zswap_header); - dst = kmap_atomic(page); -- tfm = *get_cpu_ptr(entry->pool->tfm); -+ local_lock(&zswap_comp.lock); -+ tfm = *this_cpu_ptr(entry->pool->tfm); - ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen); -- put_cpu_ptr(entry->pool->tfm); -+ local_unlock(&zswap_comp.lock); - kunmap_atomic(dst); - zpool_unmap_handle(entry->pool->zpool, entry->handle); - BUG_ON(ret); --- -2.30.2 - diff --git a/debian/patches-rt/0216-wait.h-include-atomic.h.patch b/debian/patches-rt/0216-wait.h-include-atomic.h.patch deleted file mode 100644 index 7ec901ac5..000000000 --- a/debian/patches-rt/0216-wait.h-include-atomic.h.patch +++ 
/dev/null @@ -1,42 +0,0 @@ -From 5e394b3bdbd79a62a1d1470fe383d2e1090fba7f Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 28 Oct 2013 12:19:57 +0100 -Subject: [PATCH 216/296] wait.h: include atomic.h -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -| CC init/main.o -|In file included from include/linux/mmzone.h:9:0, -| from include/linux/gfp.h:4, -| from include/linux/kmod.h:22, -| from include/linux/module.h:13, -| from init/main.c:15: -|include/linux/wait.h: In function ‘wait_on_atomic_t’: -|include/linux/wait.h:982:2: error: implicit declaration of function ‘atomic_read’ [-Werror=implicit-function-declaration] -| if (atomic_read(val) == 0) -| ^ - -This pops up on ARM. Non-RT gets its atomic.h include from spinlock.h - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/wait.h | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/include/linux/wait.h b/include/linux/wait.h -index 27fb99cfeb02..93b42387b4c6 100644 ---- a/include/linux/wait.h -+++ b/include/linux/wait.h -@@ -10,6 +10,7 @@ - - #include <asm/current.h> - #include <uapi/linux/wait.h> -+#include <linux/atomic.h> - - typedef struct wait_queue_entry wait_queue_entry_t; - --- -2.30.2 - diff --git a/debian/patches-rt/0217-sched-Limit-the-number-of-task-migrations-per-batch.patch b/debian/patches-rt/0217-sched-Limit-the-number-of-task-migrations-per-batch.patch deleted file mode 100644 index 135ba6ed9..000000000 --- a/debian/patches-rt/0217-sched-Limit-the-number-of-task-migrations-per-batch.patch +++ /dev/null @@ -1,33 +0,0 @@ -From fd05866a1422f832ca4e09ae0b1866d7576ca895 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon, 6 Jun 2011 12:12:51 +0200 -Subject: [PATCH 217/296] sched: Limit the number of task migrations per batch -Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Put an upper limit on the number of tasks which are migrated per batch -to avoid large latencies. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - kernel/sched/core.c | 4 ++++ - 1 file changed, 4 insertions(+) - -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index a7005a5dea68..67514da5df20 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -64,7 +64,11 @@ const_debug unsigned int sysctl_sched_features = - * Number of tasks to iterate in a single balance run. - * Limited because this is done with IRQs disabled. - */ -+#ifdef CONFIG_PREEMPT_RT -+const_debug unsigned int sysctl_sched_nr_migrate = 8; -+#else - const_debug unsigned int sysctl_sched_nr_migrate = 32; -+#endif - - /* - * period over which we measure -rt task CPU usage in us. --- -2.30.2 - diff --git a/debian/patches-rt/0218-sched-Move-mmdrop-to-RCU-on-RT.patch b/debian/patches-rt/0218-sched-Move-mmdrop-to-RCU-on-RT.patch deleted file mode 100644 index 5c726a474..000000000 --- a/debian/patches-rt/0218-sched-Move-mmdrop-to-RCU-on-RT.patch +++ /dev/null @@ -1,115 +0,0 @@ -From 0978887259d2a9433d9a5b08f28886aa9cf4d94f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon, 6 Jun 2011 12:20:33 +0200 -Subject: [PATCH 218/296] sched: Move mmdrop to RCU on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Takes sleeping locks and calls into the memory allocator, so nothing -we want to do in task switch and oder atomic contexts. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/mm_types.h | 4 ++++ - include/linux/sched/mm.h | 11 +++++++++++ - kernel/fork.c | 13 +++++++++++++ - kernel/sched/core.c | 7 ++++++- - 4 files changed, 34 insertions(+), 1 deletion(-) - -diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h -index 3433ecc9c1f7..02649396954b 100644 ---- a/include/linux/mm_types.h -+++ b/include/linux/mm_types.h -@@ -12,6 +12,7 @@ - #include <linux/completion.h> - #include <linux/cpumask.h> - #include <linux/uprobes.h> -+#include <linux/rcupdate.h> - #include <linux/page-flags-layout.h> - #include <linux/workqueue.h> - #include <linux/seqlock.h> -@@ -557,6 +558,9 @@ struct mm_struct { - bool tlb_flush_batched; - #endif - struct uprobes_state uprobes_state; -+#ifdef CONFIG_PREEMPT_RT -+ struct rcu_head delayed_drop; -+#endif - #ifdef CONFIG_HUGETLB_PAGE - atomic_long_t hugetlb_usage; - #endif -diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h -index dc1f4dcd9a82..9796cc213b74 100644 ---- a/include/linux/sched/mm.h -+++ b/include/linux/sched/mm.h -@@ -49,6 +49,17 @@ static inline void mmdrop(struct mm_struct *mm) - __mmdrop(mm); - } - -+#ifdef CONFIG_PREEMPT_RT -+extern void __mmdrop_delayed(struct rcu_head *rhp); -+static inline void mmdrop_delayed(struct mm_struct *mm) -+{ -+ if (atomic_dec_and_test(&mm->mm_count)) -+ call_rcu(&mm->delayed_drop, __mmdrop_delayed); -+} -+#else -+# define mmdrop_delayed(mm) mmdrop(mm) -+#endif -+ - /** - * mmget() - Pin the address space associated with a &struct mm_struct. - * @mm: The address space to pin. -diff --git a/kernel/fork.c b/kernel/fork.c -index e903347d1bc1..8624090a7440 100644 ---- a/kernel/fork.c -+++ b/kernel/fork.c -@@ -688,6 +688,19 @@ void __mmdrop(struct mm_struct *mm) - } - EXPORT_SYMBOL_GPL(__mmdrop); - -+#ifdef CONFIG_PREEMPT_RT -+/* -+ * RCU callback for delayed mm drop. Not strictly rcu, but we don't -+ * want another facility to make this work. 
-+ */ -+void __mmdrop_delayed(struct rcu_head *rhp) -+{ -+ struct mm_struct *mm = container_of(rhp, struct mm_struct, delayed_drop); -+ -+ __mmdrop(mm); -+} -+#endif -+ - static void mmdrop_async_fn(struct work_struct *work) - { - struct mm_struct *mm; -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 67514da5df20..bb7948ad463d 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -4245,9 +4245,13 @@ static struct rq *finish_task_switch(struct task_struct *prev) - * provided by mmdrop(), - * - a sync_core for SYNC_CORE. - */ -+ /* -+ * We use mmdrop_delayed() here so we don't have to do the -+ * full __mmdrop() when we are the last user. -+ */ - if (mm) { - membarrier_mm_sync_core_before_usermode(mm); -- mmdrop(mm); -+ mmdrop_delayed(mm); - } - if (unlikely(prev_state == TASK_DEAD)) { - if (prev->sched_class->task_dead) -@@ -7262,6 +7266,7 @@ void sched_setnuma(struct task_struct *p, int nid) - #endif /* CONFIG_NUMA_BALANCING */ - - #ifdef CONFIG_HOTPLUG_CPU -+ - /* - * Ensure that the idle task is using init_mm right before its CPU goes - * offline. 
--- -2.30.2 - diff --git a/debian/patches-rt/0219-kernel-sched-move-stack-kprobe-clean-up-to-__put_tas.patch b/debian/patches-rt/0219-kernel-sched-move-stack-kprobe-clean-up-to-__put_tas.patch deleted file mode 100644 index 776e38585..000000000 --- a/debian/patches-rt/0219-kernel-sched-move-stack-kprobe-clean-up-to-__put_tas.patch +++ /dev/null @@ -1,81 +0,0 @@ -From 85525b7cda178540de49b14962103a67b42944e4 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 21 Nov 2016 19:31:08 +0100 -Subject: [PATCH 219/296] kernel/sched: move stack + kprobe clean up to - __put_task_struct() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -There is no need to free the stack before the task struct (except for reasons -mentioned in commit 68f24b08ee89 ("sched/core: Free the stack early if -CONFIG_THREAD_INFO_IN_TASK")). This also comes handy on -RT because we can't -free memory in preempt disabled region. -vfree_atomic() delays the memory cleanup to a worker. Since we move everything -to the RCU callback, we can also free it immediately. 
- -Cc: stable-rt@vger.kernel.org #for kprobe_flush_task() -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/fork.c | 12 +++++++++++- - kernel/sched/core.c | 9 --------- - 2 files changed, 11 insertions(+), 10 deletions(-) - -diff --git a/kernel/fork.c b/kernel/fork.c -index 8624090a7440..eaa47c928f96 100644 ---- a/kernel/fork.c -+++ b/kernel/fork.c -@@ -42,6 +42,7 @@ - #include <linux/mmu_notifier.h> - #include <linux/fs.h> - #include <linux/mm.h> -+#include <linux/kprobes.h> - #include <linux/vmacache.h> - #include <linux/nsproxy.h> - #include <linux/capability.h> -@@ -288,7 +289,7 @@ static inline void free_thread_stack(struct task_struct *tsk) - return; - } - -- vfree_atomic(tsk->stack); -+ vfree(tsk->stack); - return; - } - #endif -@@ -742,6 +743,15 @@ void __put_task_struct(struct task_struct *tsk) - WARN_ON(refcount_read(&tsk->usage)); - WARN_ON(tsk == current); - -+ /* -+ * Remove function-return probe instances associated with this -+ * task and put them back on the free list. -+ */ -+ kprobe_flush_task(tsk); -+ -+ /* Task is done with its stack. */ -+ put_task_stack(tsk); -+ - io_uring_free(tsk); - cgroup_free(tsk); - task_numa_free(tsk, true); -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index bb7948ad463d..449343e31e72 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -4257,15 +4257,6 @@ static struct rq *finish_task_switch(struct task_struct *prev) - if (prev->sched_class->task_dead) - prev->sched_class->task_dead(prev); - -- /* -- * Remove function-return probe instances associated with this -- * task and put them back on the free list. -- */ -- kprobe_flush_task(prev); -- -- /* Task is done with its stack. 
*/ -- put_task_stack(prev); -- - put_task_struct_rcu_user(prev); - } - --- -2.30.2 - diff --git a/debian/patches-rt/0220-sched-Do-not-account-rcu_preempt_depth-on-RT-in-migh.patch b/debian/patches-rt/0220-sched-Do-not-account-rcu_preempt_depth-on-RT-in-migh.patch deleted file mode 100644 index fabd590de..000000000 --- a/debian/patches-rt/0220-sched-Do-not-account-rcu_preempt_depth-on-RT-in-migh.patch +++ /dev/null @@ -1,57 +0,0 @@ -From a4c7100f6d2ef4a46dd828a673e51761474d10b3 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 7 Jun 2011 09:19:06 +0200 -Subject: [PATCH 220/296] sched: Do not account rcu_preempt_depth on RT in - might_sleep() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -RT changes the rcu_preempt_depth semantics, so we cannot check for it -in might_sleep(). - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/rcupdate.h | 7 +++++++ - kernel/sched/core.c | 2 +- - 2 files changed, 8 insertions(+), 1 deletion(-) - -diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h -index fc5ba83fc818..f251ba473f77 100644 ---- a/include/linux/rcupdate.h -+++ b/include/linux/rcupdate.h -@@ -52,6 +52,11 @@ void __rcu_read_unlock(void); - * types of kernel builds, the rcu_read_lock() nesting depth is unknowable. 
- */ - #define rcu_preempt_depth() (current->rcu_read_lock_nesting) -+#ifndef CONFIG_PREEMPT_RT -+#define sched_rcu_preempt_depth() rcu_preempt_depth() -+#else -+static inline int sched_rcu_preempt_depth(void) { return 0; } -+#endif - - #else /* #ifdef CONFIG_PREEMPT_RCU */ - -@@ -77,6 +82,8 @@ static inline int rcu_preempt_depth(void) - return 0; - } - -+#define sched_rcu_preempt_depth() rcu_preempt_depth() -+ - #endif /* #else #ifdef CONFIG_PREEMPT_RCU */ - - /* Internal to kernel */ -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 449343e31e72..3e6aaf6161f7 100644 ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -7874,7 +7874,7 @@ void __init sched_init(void) - #ifdef CONFIG_DEBUG_ATOMIC_SLEEP - static inline int preempt_count_equals(int preempt_offset) - { -- int nested = preempt_count() + rcu_preempt_depth(); -+ int nested = preempt_count() + sched_rcu_preempt_depth(); - - return (nested == preempt_offset); - } --- -2.30.2 - diff --git a/debian/patches-rt/0221-sched-Disable-TTWU_QUEUE-on-RT.patch b/debian/patches-rt/0221-sched-Disable-TTWU_QUEUE-on-RT.patch deleted file mode 100644 index 8114bb240..000000000 --- a/debian/patches-rt/0221-sched-Disable-TTWU_QUEUE-on-RT.patch +++ /dev/null @@ -1,38 +0,0 @@ -From 8d7cea5c91830c670105c8e7dca7b3093f5306b8 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 13 Sep 2011 16:42:35 +0200 -Subject: [PATCH 221/296] sched: Disable TTWU_QUEUE on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The queued remote wakeup mechanism can introduce rather large -latencies if the number of migrated tasks is high. Disable it for RT. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - kernel/sched/features.h | 5 +++++ - 1 file changed, 5 insertions(+) - -diff --git a/kernel/sched/features.h b/kernel/sched/features.h -index 68d369cba9e4..296aea55f44c 100644 ---- a/kernel/sched/features.h -+++ b/kernel/sched/features.h -@@ -45,11 +45,16 @@ SCHED_FEAT(DOUBLE_TICK, false) - */ - SCHED_FEAT(NONTASK_CAPACITY, true) - -+#ifdef CONFIG_PREEMPT_RT -+SCHED_FEAT(TTWU_QUEUE, false) -+#else -+ - /* - * Queue remote wakeups on the target CPU and process them - * using the scheduler IPI. Reduces rq->lock contention/bounces. - */ - SCHED_FEAT(TTWU_QUEUE, true) -+#endif - - /* - * When doing wakeups, attempt to limit superfluous scans of the LLC domain. --- -2.30.2 - diff --git a/debian/patches-rt/0222-softirq-Check-preemption-after-reenabling-interrupts.patch b/debian/patches-rt/0222-softirq-Check-preemption-after-reenabling-interrupts.patch deleted file mode 100644 index 6c0d2a62d..000000000 --- a/debian/patches-rt/0222-softirq-Check-preemption-after-reenabling-interrupts.patch +++ /dev/null @@ -1,151 +0,0 @@ -From b9518d5196670c6ba5c1f67d07e8c9aa53df4384 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Sun, 13 Nov 2011 17:17:09 +0100 -Subject: [PATCH 222/296] softirq: Check preemption after reenabling interrupts -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -raise_softirq_irqoff() disables interrupts and wakes the softirq -daemon, but after reenabling interrupts there is no preemption check, -so the execution of the softirq thread might be delayed arbitrarily. - -In principle we could add that check to local_irq_enable/restore, but -that's overkill as the rasie_softirq_irqoff() sections are the only -ones which show this behaviour. 
- -Reported-by: Carsten Emde <cbe@osadl.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/preempt.h | 3 +++ - lib/irq_poll.c | 5 +++++ - net/core/dev.c | 7 +++++++ - 3 files changed, 15 insertions(+) - -diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index 5ceac863e729..fb140e00f74d 100644 ---- a/include/linux/preempt.h -+++ b/include/linux/preempt.h -@@ -190,8 +190,10 @@ do { \ - - #ifdef CONFIG_PREEMPT_RT - # define preempt_enable_no_resched() sched_preempt_enable_no_resched() -+# define preempt_check_resched_rt() preempt_check_resched() - #else - # define preempt_enable_no_resched() preempt_enable() -+# define preempt_check_resched_rt() barrier(); - #endif - - #define preemptible() (preempt_count() == 0 && !irqs_disabled()) -@@ -262,6 +264,7 @@ do { \ - #define preempt_disable_notrace() barrier() - #define preempt_enable_no_resched_notrace() barrier() - #define preempt_enable_notrace() barrier() -+#define preempt_check_resched_rt() barrier() - #define preemptible() 0 - - #endif /* CONFIG_PREEMPT_COUNT */ -diff --git a/lib/irq_poll.c b/lib/irq_poll.c -index 2f17b488d58e..7557bf7ecf1f 100644 ---- a/lib/irq_poll.c -+++ b/lib/irq_poll.c -@@ -37,6 +37,7 @@ void irq_poll_sched(struct irq_poll *iop) - list_add_tail(&iop->list, this_cpu_ptr(&blk_cpu_iopoll)); - raise_softirq_irqoff(IRQ_POLL_SOFTIRQ); - local_irq_restore(flags); -+ preempt_check_resched_rt(); - } - EXPORT_SYMBOL(irq_poll_sched); - -@@ -72,6 +73,7 @@ void irq_poll_complete(struct irq_poll *iop) - local_irq_save(flags); - __irq_poll_complete(iop); - local_irq_restore(flags); -+ preempt_check_resched_rt(); - } - EXPORT_SYMBOL(irq_poll_complete); - -@@ -96,6 +98,7 @@ static void __latent_entropy irq_poll_softirq(struct softirq_action *h) - } - - local_irq_enable(); -+ preempt_check_resched_rt(); - - /* Even though interrupts have been re-enabled, this - * access is safe because interrupts can only add new -@@ -133,6 +136,7 @@ static void __latent_entropy 
irq_poll_softirq(struct softirq_action *h) - __raise_softirq_irqoff(IRQ_POLL_SOFTIRQ); - - local_irq_enable(); -+ preempt_check_resched_rt(); - } - - /** -@@ -196,6 +200,7 @@ static int irq_poll_cpu_dead(unsigned int cpu) - this_cpu_ptr(&blk_cpu_iopoll)); - __raise_softirq_irqoff(IRQ_POLL_SOFTIRQ); - local_irq_enable(); -+ preempt_check_resched_rt(); - - return 0; - } -diff --git a/net/core/dev.c b/net/core/dev.c -index 64f4c7ec729d..0b88c6154739 100644 ---- a/net/core/dev.c -+++ b/net/core/dev.c -@@ -3048,6 +3048,7 @@ static void __netif_reschedule(struct Qdisc *q) - sd->output_queue_tailp = &q->next_sched; - raise_softirq_irqoff(NET_TX_SOFTIRQ); - local_irq_restore(flags); -+ preempt_check_resched_rt(); - } - - void __netif_schedule(struct Qdisc *q) -@@ -3110,6 +3111,7 @@ void __dev_kfree_skb_irq(struct sk_buff *skb, enum skb_free_reason reason) - __this_cpu_write(softnet_data.completion_queue, skb); - raise_softirq_irqoff(NET_TX_SOFTIRQ); - local_irq_restore(flags); -+ preempt_check_resched_rt(); - } - EXPORT_SYMBOL(__dev_kfree_skb_irq); - -@@ -4572,6 +4574,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu, - rps_unlock(sd); - - local_irq_restore(flags); -+ preempt_check_resched_rt(); - - atomic_long_inc(&skb->dev->rx_dropped); - kfree_skb(skb); -@@ -6291,12 +6294,14 @@ static void net_rps_action_and_irq_enable(struct softnet_data *sd) - sd->rps_ipi_list = NULL; - - local_irq_enable(); -+ preempt_check_resched_rt(); - - /* Send pending IPI's to kick RPS processing on remote cpus. 
*/ - net_rps_send_ipi(remsd); - } else - #endif - local_irq_enable(); -+ preempt_check_resched_rt(); - } - - static bool sd_has_rps_ipi_waiting(struct softnet_data *sd) -@@ -6374,6 +6379,7 @@ void __napi_schedule(struct napi_struct *n) - local_irq_save(flags); - ____napi_schedule(this_cpu_ptr(&softnet_data), n); - local_irq_restore(flags); -+ preempt_check_resched_rt(); - } - EXPORT_SYMBOL(__napi_schedule); - -@@ -10905,6 +10911,7 @@ static int dev_cpu_dead(unsigned int oldcpu) - - raise_softirq_irqoff(NET_TX_SOFTIRQ); - local_irq_enable(); -+ preempt_check_resched_rt(); - - #ifdef CONFIG_RPS - remsd = oldsd->rps_ipi_list; --- -2.30.2 - diff --git a/debian/patches-rt/0223-softirq-Disable-softirq-stacks-for-RT.patch b/debian/patches-rt/0223-softirq-Disable-softirq-stacks-for-RT.patch deleted file mode 100644 index 78df4b73e..000000000 --- a/debian/patches-rt/0223-softirq-Disable-softirq-stacks-for-RT.patch +++ /dev/null @@ -1,168 +0,0 @@ -From 32619a9db436ba2bd525bab289356a2c85668c7e Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon, 18 Jul 2011 13:59:17 +0200 -Subject: [PATCH 223/296] softirq: Disable softirq stacks for RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Disable extra stacks for softirqs. We want to preempt softirqs and -having them on special IRQ-stack does not make this easier. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - arch/powerpc/kernel/irq.c | 2 ++ - arch/powerpc/kernel/misc_32.S | 2 ++ - arch/powerpc/kernel/misc_64.S | 2 ++ - arch/sh/kernel/irq.c | 2 ++ - arch/sparc/kernel/irq_64.c | 2 ++ - arch/x86/kernel/irq_32.c | 2 ++ - arch/x86/kernel/irq_64.c | 2 ++ - include/linux/interrupt.h | 2 +- - 8 files changed, 15 insertions(+), 1 deletion(-) - -diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c -index e8a548447dd6..5ad4f27cba10 100644 ---- a/arch/powerpc/kernel/irq.c -+++ b/arch/powerpc/kernel/irq.c -@@ -753,10 +753,12 @@ void *mcheckirq_ctx[NR_CPUS] __read_mostly; - void *softirq_ctx[NR_CPUS] __read_mostly; - void *hardirq_ctx[NR_CPUS] __read_mostly; - -+#ifndef CONFIG_PREEMPT_RT - void do_softirq_own_stack(void) - { - call_do_softirq(softirq_ctx[smp_processor_id()]); - } -+#endif - - irq_hw_number_t virq_to_hw(unsigned int virq) - { -diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S -index 717e658b90fd..08ee95ad6593 100644 ---- a/arch/powerpc/kernel/misc_32.S -+++ b/arch/powerpc/kernel/misc_32.S -@@ -31,6 +31,7 @@ - * We store the saved ksp_limit in the unused part - * of the STACK_FRAME_OVERHEAD - */ -+#ifndef CONFIG_PREEMPT_RT - _GLOBAL(call_do_softirq) - mflr r0 - stw r0,4(r1) -@@ -46,6 +47,7 @@ _GLOBAL(call_do_softirq) - stw r10,THREAD+KSP_LIMIT(r2) - mtlr r0 - blr -+#endif - - /* - * void call_do_irq(struct pt_regs *regs, void *sp); -diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S -index 070465825c21..a6b33f7b3264 100644 ---- a/arch/powerpc/kernel/misc_64.S -+++ b/arch/powerpc/kernel/misc_64.S -@@ -27,6 +27,7 @@ - - .text - -+#ifndef CONFIG_PREEMPT_RT - _GLOBAL(call_do_softirq) - mflr r0 - std r0,16(r1) -@@ -37,6 +38,7 @@ _GLOBAL(call_do_softirq) - ld r0,16(r1) - mtlr r0 - blr -+#endif - - _GLOBAL(call_do_irq) - mflr r0 -diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c -index ab5f790b0cd2..5db7af565dec 100644 ---- 
a/arch/sh/kernel/irq.c -+++ b/arch/sh/kernel/irq.c -@@ -148,6 +148,7 @@ void irq_ctx_exit(int cpu) - hardirq_ctx[cpu] = NULL; - } - -+#ifndef CONFIG_PREEMPT_RT - void do_softirq_own_stack(void) - { - struct thread_info *curctx; -@@ -175,6 +176,7 @@ void do_softirq_own_stack(void) - "r5", "r6", "r7", "r8", "r9", "r15", "t", "pr" - ); - } -+#endif - #else - static inline void handle_one_irq(unsigned int irq) - { -diff --git a/arch/sparc/kernel/irq_64.c b/arch/sparc/kernel/irq_64.c -index 3ec9f1402aad..eb21682abfcb 100644 ---- a/arch/sparc/kernel/irq_64.c -+++ b/arch/sparc/kernel/irq_64.c -@@ -854,6 +854,7 @@ void __irq_entry handler_irq(int pil, struct pt_regs *regs) - set_irq_regs(old_regs); - } - -+#ifndef CONFIG_PREEMPT_RT - void do_softirq_own_stack(void) - { - void *orig_sp, *sp = softirq_stack[smp_processor_id()]; -@@ -868,6 +869,7 @@ void do_softirq_own_stack(void) - __asm__ __volatile__("mov %0, %%sp" - : : "r" (orig_sp)); - } -+#endif - - #ifdef CONFIG_HOTPLUG_CPU - void fixup_irqs(void) -diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c -index 0b79efc87be5..93c6b88b382a 100644 ---- a/arch/x86/kernel/irq_32.c -+++ b/arch/x86/kernel/irq_32.c -@@ -131,6 +131,7 @@ int irq_init_percpu_irqstack(unsigned int cpu) - return 0; - } - -+#ifndef CONFIG_PREEMPT_RT - void do_softirq_own_stack(void) - { - struct irq_stack *irqstk; -@@ -147,6 +148,7 @@ void do_softirq_own_stack(void) - - call_on_stack(__do_softirq, isp); - } -+#endif - - void __handle_irq(struct irq_desc *desc, struct pt_regs *regs) - { -diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c -index 440eed558558..7cfc4e6b7c94 100644 ---- a/arch/x86/kernel/irq_64.c -+++ b/arch/x86/kernel/irq_64.c -@@ -72,7 +72,9 @@ int irq_init_percpu_irqstack(unsigned int cpu) - return map_irq_stack(cpu); - } - -+#ifndef CONFIG_PREEMPT_RT - void do_softirq_own_stack(void) - { - run_on_irqstack_cond(__do_softirq, NULL); - } -+#endif -diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h 
-index e58f9b0650d3..7545a2f18560 100644 ---- a/include/linux/interrupt.h -+++ b/include/linux/interrupt.h -@@ -560,7 +560,7 @@ struct softirq_action - asmlinkage void do_softirq(void); - asmlinkage void __do_softirq(void); - --#ifdef __ARCH_HAS_DO_SOFTIRQ -+#if defined(__ARCH_HAS_DO_SOFTIRQ) && !defined(CONFIG_PREEMPT_RT) - void do_softirq_own_stack(void); - #else - static inline void do_softirq_own_stack(void) --- -2.30.2 - diff --git a/debian/patches-rt/0225-pid.h-include-atomic.h.patch b/debian/patches-rt/0225-pid.h-include-atomic.h.patch deleted file mode 100644 index 29ecee29b..000000000 --- a/debian/patches-rt/0225-pid.h-include-atomic.h.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 9e2bf53f5a42741cca753d2199908973b7deb32d Mon Sep 17 00:00:00 2001 -From: Grygorii Strashko <Grygorii.Strashko@linaro.org> -Date: Tue, 21 Jul 2015 19:43:56 +0300 -Subject: [PATCH 225/296] pid.h: include atomic.h -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -This patch fixes build error: - CC kernel/pid_namespace.o -In file included from kernel/pid_namespace.c:11:0: -include/linux/pid.h: In function 'get_pid': -include/linux/pid.h:78:3: error: implicit declaration of function 'atomic_inc' [-Werror=implicit-function-declaration] - atomic_inc(&pid->count); - ^ -which happens when - CONFIG_PROVE_LOCKING=n - CONFIG_DEBUG_SPINLOCK=n - CONFIG_DEBUG_MUTEXES=n - CONFIG_DEBUG_LOCK_ALLOC=n - CONFIG_PID_NS=y - -Vanilla gets this via spinlock.h. 
- -Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/pid.h | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/include/linux/pid.h b/include/linux/pid.h -index fa10acb8d6a4..2f86f84e9fc1 100644 ---- a/include/linux/pid.h -+++ b/include/linux/pid.h -@@ -3,6 +3,7 @@ - #define _LINUX_PID_H - - #include <linux/rculist.h> -+#include <linux/atomic.h> - #include <linux/wait.h> - #include <linux/refcount.h> - --- -2.30.2 - diff --git a/debian/patches-rt/0226-ptrace-fix-ptrace-vs-tasklist_lock-race.patch b/debian/patches-rt/0226-ptrace-fix-ptrace-vs-tasklist_lock-race.patch deleted file mode 100644 index 839518ddf..000000000 --- a/debian/patches-rt/0226-ptrace-fix-ptrace-vs-tasklist_lock-race.patch +++ /dev/null @@ -1,157 +0,0 @@ -From 200a4b494c5dfcbb2713ad9a0c32a7d82951097f Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 29 Aug 2013 18:21:04 +0200 -Subject: [PATCH 226/296] ptrace: fix ptrace vs tasklist_lock race -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -As explained by Alexander Fyodorov <halcy@yandex.ru>: - -|read_lock(&tasklist_lock) in ptrace_stop() is converted to mutex on RT kernel, -|and it can remove __TASK_TRACED from task->state (by moving it to -|task->saved_state). If parent does wait() on child followed by a sys_ptrace -|call, the following race can happen: -| -|- child sets __TASK_TRACED in ptrace_stop() -|- parent does wait() which eventually calls wait_task_stopped() and returns -| child's pid -|- child blocks on read_lock(&tasklist_lock) in ptrace_stop() and moves -| __TASK_TRACED flag to saved_state -|- parent calls sys_ptrace, which calls ptrace_check_attach() and wait_task_inactive() - -The patch is based on his initial patch where an additional check is -added in case the __TASK_TRACED moved to ->saved_state. 
The pi_lock is -taken in case the caller is interrupted between looking into ->state and -->saved_state. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/sched.h | 49 +++++++++++++++++++++++++++++++++++++++---- - kernel/ptrace.c | 9 +++++++- - kernel/sched/core.c | 17 +++++++++++++-- - 3 files changed, 68 insertions(+), 7 deletions(-) - ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -112,12 +112,8 @@ - __TASK_TRACED | EXIT_DEAD | EXIT_ZOMBIE | \ - TASK_PARKED) - --#define task_is_traced(task) ((task->state & __TASK_TRACED) != 0) -- - #define task_is_stopped(task) ((task->state & __TASK_STOPPED) != 0) - --#define task_is_stopped_or_traced(task) ((task->state & (__TASK_STOPPED | __TASK_TRACED)) != 0) -- - #ifdef CONFIG_DEBUG_ATOMIC_SLEEP - - /* -@@ -1876,6 +1872,51 @@ - return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED)); - } - -+static inline bool __task_is_stopped_or_traced(struct task_struct *task) -+{ -+ if (task->state & (__TASK_STOPPED | __TASK_TRACED)) -+ return true; -+#ifdef CONFIG_PREEMPT_RT -+ if (task->saved_state & (__TASK_STOPPED | __TASK_TRACED)) -+ return true; -+#endif -+ return false; -+} -+ -+static inline bool task_is_stopped_or_traced(struct task_struct *task) -+{ -+ bool traced_stopped; -+ -+#ifdef CONFIG_PREEMPT_RT -+ unsigned long flags; -+ -+ raw_spin_lock_irqsave(&task->pi_lock, flags); -+ traced_stopped = __task_is_stopped_or_traced(task); -+ raw_spin_unlock_irqrestore(&task->pi_lock, flags); -+#else -+ traced_stopped = __task_is_stopped_or_traced(task); -+#endif -+ return traced_stopped; -+} -+ -+static inline bool task_is_traced(struct task_struct *task) -+{ -+ bool traced = false; -+ -+ if (task->state & __TASK_TRACED) -+ return true; -+#ifdef CONFIG_PREEMPT_RT -+ /* in case the task is sleeping on tasklist_lock */ -+ raw_spin_lock_irq(&task->pi_lock); -+ if (task->state & __TASK_TRACED) -+ traced = true; -+ else if (task->saved_state & __TASK_TRACED) -+ traced = true; -+ 
raw_spin_unlock_irq(&task->pi_lock); -+#endif -+ return traced; -+} -+ - /* - * cond_resched() and cond_resched_lock(): latency reduction via - * explicit rescheduling in places that are safe. The return ---- a/kernel/ptrace.c -+++ b/kernel/ptrace.c -@@ -196,7 +196,14 @@ - spin_lock_irq(&task->sighand->siglock); - if (task_is_traced(task) && !looks_like_a_spurious_pid(task) && - !__fatal_signal_pending(task)) { -- task->state = __TASK_TRACED; -+ unsigned long flags; -+ -+ raw_spin_lock_irqsave(&task->pi_lock, flags); -+ if (task->state & __TASK_TRACED) -+ task->state = __TASK_TRACED; -+ else -+ task->saved_state = __TASK_TRACED; -+ raw_spin_unlock_irqrestore(&task->pi_lock, flags); - ret = true; - } - spin_unlock_irq(&task->sighand->siglock); ---- a/kernel/sched/core.c -+++ b/kernel/sched/core.c -@@ -2571,6 +2571,18 @@ - } - #endif /* CONFIG_NUMA_BALANCING */ - -+static bool check_task_state(struct task_struct *p, long match_state) -+{ -+ bool match = false; -+ -+ raw_spin_lock_irq(&p->pi_lock); -+ if (p->state == match_state || p->saved_state == match_state) -+ match = true; -+ raw_spin_unlock_irq(&p->pi_lock); -+ -+ return match; -+} -+ - /* - * wait_task_inactive - wait for a thread to unschedule. - * -@@ -2615,7 +2627,7 @@ - * is actually now running somewhere else! 
- */ - while (task_running(rq, p)) { -- if (match_state && unlikely(p->state != match_state)) -+ if (match_state && !check_task_state(p, match_state)) - return 0; - cpu_relax(); - } -@@ -2630,7 +2642,8 @@ - running = task_running(rq, p); - queued = task_on_rq_queued(p); - ncsw = 0; -- if (!match_state || p->state == match_state) -+ if (!match_state || p->state == match_state || -+ p->saved_state == match_state) - ncsw = p->nvcsw | LONG_MIN; /* sets MSB */ - task_rq_unlock(rq, p, &rf); - diff --git a/debian/patches-rt/0227-ptrace-fix-ptrace_unfreeze_traced-race-with-rt-lock.patch b/debian/patches-rt/0227-ptrace-fix-ptrace_unfreeze_traced-race-with-rt-lock.patch deleted file mode 100644 index b6e95fca6..000000000 --- a/debian/patches-rt/0227-ptrace-fix-ptrace_unfreeze_traced-race-with-rt-lock.patch +++ /dev/null @@ -1,65 +0,0 @@ -From 5977717d963550652a577e8e12252335b121a869 Mon Sep 17 00:00:00 2001 -From: Oleg Nesterov <oleg@redhat.com> -Date: Tue, 3 Nov 2020 12:39:01 +0100 -Subject: [PATCH 227/296] ptrace: fix ptrace_unfreeze_traced() race with - rt-lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The patch "ptrace: fix ptrace vs tasklist_lock race" changed -ptrace_freeze_traced() to take task->saved_state into account, but -ptrace_unfreeze_traced() has the same problem and needs a similar fix: -it should check/update both ->state and ->saved_state. - -Reported-by: Luis Claudio R. 
Goncalves <lgoncalv@redhat.com> -Fixes: "ptrace: fix ptrace vs tasklist_lock race" -Signed-off-by: Oleg Nesterov <oleg@redhat.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Cc: stable-rt@vger.kernel.org ---- - kernel/ptrace.c | 23 +++++++++++++++-------- - 1 file changed, 15 insertions(+), 8 deletions(-) - -diff --git a/kernel/ptrace.c b/kernel/ptrace.c -index 0b81543c4050..2f09a6dbe140 100644 ---- a/kernel/ptrace.c -+++ b/kernel/ptrace.c -@@ -197,8 +197,8 @@ static bool ptrace_freeze_traced(struct task_struct *task) - - static void ptrace_unfreeze_traced(struct task_struct *task) - { -- if (task->state != __TASK_TRACED) -- return; -+ unsigned long flags; -+ bool frozen = true; - - WARN_ON(!task->ptrace || task->parent != current); - -@@ -207,12 +207,19 @@ static void ptrace_unfreeze_traced(struct task_struct *task) - * Recheck state under the lock to close this race. - */ - spin_lock_irq(&task->sighand->siglock); -- if (task->state == __TASK_TRACED) { -- if (__fatal_signal_pending(task)) -- wake_up_state(task, __TASK_TRACED); -- else -- task->state = TASK_TRACED; -- } -+ -+ raw_spin_lock_irqsave(&task->pi_lock, flags); -+ if (task->state == __TASK_TRACED) -+ task->state = TASK_TRACED; -+ else if (task->saved_state == __TASK_TRACED) -+ task->saved_state = TASK_TRACED; -+ else -+ frozen = false; -+ raw_spin_unlock_irqrestore(&task->pi_lock, flags); -+ -+ if (frozen && __fatal_signal_pending(task)) -+ wake_up_state(task, __TASK_TRACED); -+ - spin_unlock_irq(&task->sighand->siglock); - } - --- -2.30.2 - diff --git a/debian/patches-rt/0229-trace-Add-migrate-disabled-counter-to-tracing-output.patch b/debian/patches-rt/0229-trace-Add-migrate-disabled-counter-to-tracing-output.patch deleted file mode 100644 index 15867ae44..000000000 --- a/debian/patches-rt/0229-trace-Add-migrate-disabled-counter-to-tracing-output.patch +++ /dev/null @@ -1,123 +0,0 @@ -From 7e74c5c3df5fb9bcb3ff9e8e3b34bdc0409b4f36 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner 
<tglx@linutronix.de> -Date: Sun, 17 Jul 2011 21:56:42 +0200 -Subject: [PATCH 229/296] trace: Add migrate-disabled counter to tracing output -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/trace_events.h | 2 ++ - kernel/trace/trace.c | 26 +++++++++++++++++++------- - kernel/trace/trace_events.c | 1 + - kernel/trace/trace_output.c | 5 +++++ - 4 files changed, 27 insertions(+), 7 deletions(-) - -diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h -index 5d1eeac4bfbe..29c24ec33ffd 100644 ---- a/include/linux/trace_events.h -+++ b/include/linux/trace_events.h -@@ -67,6 +67,7 @@ struct trace_entry { - unsigned char flags; - unsigned char preempt_count; - int pid; -+ unsigned char migrate_disable; - }; - - #define TRACE_EVENT_TYPE_MAX \ -@@ -153,6 +154,7 @@ static inline void tracing_generic_entry_update(struct trace_entry *entry, - unsigned int trace_ctx) - { - entry->preempt_count = trace_ctx & 0xff; -+ entry->migrate_disable = (trace_ctx >> 8) & 0xff; - entry->pid = current->pid; - entry->type = type; - entry->flags = trace_ctx >> 16; -diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c -index ea5ed64017ec..255b3f027dfd 100644 ---- a/kernel/trace/trace.c -+++ b/kernel/trace/trace.c -@@ -2578,6 +2578,15 @@ enum print_line_t trace_handle_return(struct trace_seq *s) - } - EXPORT_SYMBOL_GPL(trace_handle_return); - -+static unsigned short migration_disable_value(void) -+{ -+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPT_RT) -+ return current->migration_disabled; -+#else -+ return 0; -+#endif -+} -+ - unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status) - { - unsigned int trace_flags = irqs_status; -@@ -2596,7 +2605,8 @@ unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status) - trace_flags |= TRACE_FLAG_NEED_RESCHED; - if (test_preempt_need_resched()) - trace_flags |= TRACE_FLAG_PREEMPT_RESCHED; -- 
return (trace_flags << 16) | (pc & 0xff); -+ return (trace_flags << 16) | (pc & 0xff) | -+ (migration_disable_value() & 0xff) << 8; - } - - struct ring_buffer_event * -@@ -3803,9 +3813,10 @@ static void print_lat_help_header(struct seq_file *m) - "# | / _----=> need-resched \n" - "# || / _---=> hardirq/softirq \n" - "# ||| / _--=> preempt-depth \n" -- "# |||| / delay \n" -- "# cmd pid ||||| time | caller \n" -- "# \\ / ||||| \\ | / \n"); -+ "# |||| / _-=> migrate-disable \n" -+ "# ||||| / delay \n" -+ "# cmd pid |||||| time | caller \n" -+ "# \\ / |||||| \\ | / \n"); - } - - static void print_event_info(struct array_buffer *buf, struct seq_file *m) -@@ -3843,9 +3854,10 @@ static void print_func_help_header_irq(struct array_buffer *buf, struct seq_file - seq_printf(m, "# %.*s / _----=> need-resched\n", prec, space); - seq_printf(m, "# %.*s| / _---=> hardirq/softirq\n", prec, space); - seq_printf(m, "# %.*s|| / _--=> preempt-depth\n", prec, space); -- seq_printf(m, "# %.*s||| / delay\n", prec, space); -- seq_printf(m, "# TASK-PID %.*s CPU# |||| TIMESTAMP FUNCTION\n", prec, " TGID "); -- seq_printf(m, "# | | %.*s | |||| | |\n", prec, " | "); -+ seq_printf(m, "# %.*s||| / _-=> migrate-disable\n", prec, space); -+ seq_printf(m, "# %.*s|||| / delay\n", prec, space); -+ seq_printf(m, "# TASK-PID %.*s CPU# ||||| TIMESTAMP FUNCTION\n", prec, " TGID "); -+ seq_printf(m, "# | | %.*s | ||||| | |\n", prec, " | "); - } - - void -diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c -index 546a535f1490..76e8981edac3 100644 ---- a/kernel/trace/trace_events.c -+++ b/kernel/trace/trace_events.c -@@ -183,6 +183,7 @@ static int trace_define_common_fields(void) - __common_field(unsigned char, flags); - __common_field(unsigned char, preempt_count); - __common_field(int, pid); -+ __common_field(unsigned char, migrate_disable); - - return ret; - } -diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c -index 000e9dc224c6..ad738192647b 100644 ---- 
a/kernel/trace/trace_output.c -+++ b/kernel/trace/trace_output.c -@@ -487,6 +487,11 @@ int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry) - else - trace_seq_putc(s, '.'); - -+ if (entry->migrate_disable) -+ trace_seq_printf(s, "%x", entry->migrate_disable); -+ else -+ trace_seq_putc(s, '.'); -+ - return !trace_seq_has_overflowed(s); - } - --- -2.30.2 - diff --git a/debian/patches-rt/0230-locking-don-t-check-for-__LINUX_SPINLOCK_TYPES_H-on-.patch b/debian/patches-rt/0230-locking-don-t-check-for-__LINUX_SPINLOCK_TYPES_H-on-.patch deleted file mode 100644 index cb58ab5d1..000000000 --- a/debian/patches-rt/0230-locking-don-t-check-for-__LINUX_SPINLOCK_TYPES_H-on-.patch +++ /dev/null @@ -1,166 +0,0 @@ -From 9a5e5476096644e8f525a8b112963a1db39c7001 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 4 Aug 2017 17:40:42 +0200 -Subject: [PATCH 230/296] locking: don't check for __LINUX_SPINLOCK_TYPES_H on - -RT archs -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Upstream uses arch_spinlock_t within spinlock_t and requests that -spinlock_types.h header file is included first. -On -RT we have the rt_mutex with its raw_lock wait_lock which needs -architectures' spinlock_types.h header file for its definition. However -we need rt_mutex first because it is used to build the spinlock_t so -that check does not work for us. -Therefore I am dropping that check. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/alpha/include/asm/spinlock_types.h | 4 ---- - arch/arm/include/asm/spinlock_types.h | 4 ---- - arch/arm64/include/asm/spinlock_types.h | 4 ---- - arch/hexagon/include/asm/spinlock_types.h | 4 ---- - arch/ia64/include/asm/spinlock_types.h | 4 ---- - arch/powerpc/include/asm/spinlock_types.h | 4 ---- - arch/s390/include/asm/spinlock_types.h | 4 ---- - arch/sh/include/asm/spinlock_types.h | 4 ---- - arch/xtensa/include/asm/spinlock_types.h | 4 ---- - 9 files changed, 36 deletions(-) - -diff --git a/arch/alpha/include/asm/spinlock_types.h b/arch/alpha/include/asm/spinlock_types.h -index 1d5716bc060b..6883bc952d22 100644 ---- a/arch/alpha/include/asm/spinlock_types.h -+++ b/arch/alpha/include/asm/spinlock_types.h -@@ -2,10 +2,6 @@ - #ifndef _ALPHA_SPINLOCK_TYPES_H - #define _ALPHA_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H --# error "please don't include this file directly" --#endif -- - typedef struct { - volatile unsigned int lock; - } arch_spinlock_t; -diff --git a/arch/arm/include/asm/spinlock_types.h b/arch/arm/include/asm/spinlock_types.h -index 5976958647fe..a37c0803954b 100644 ---- a/arch/arm/include/asm/spinlock_types.h -+++ b/arch/arm/include/asm/spinlock_types.h -@@ -2,10 +2,6 @@ - #ifndef __ASM_SPINLOCK_TYPES_H - #define __ASM_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H --# error "please don't include this file directly" --#endif -- - #define TICKET_SHIFT 16 - - typedef struct { -diff --git a/arch/arm64/include/asm/spinlock_types.h b/arch/arm64/include/asm/spinlock_types.h -index 18782f0c4721..6672b05350b4 100644 ---- a/arch/arm64/include/asm/spinlock_types.h -+++ b/arch/arm64/include/asm/spinlock_types.h -@@ -5,10 +5,6 @@ - #ifndef __ASM_SPINLOCK_TYPES_H - #define __ASM_SPINLOCK_TYPES_H - --#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H) --# error "please don't include this file directly" --#endif -- - #include 
<asm-generic/qspinlock_types.h> - #include <asm-generic/qrwlock_types.h> - -diff --git a/arch/hexagon/include/asm/spinlock_types.h b/arch/hexagon/include/asm/spinlock_types.h -index 19d233497ba5..de72fb23016d 100644 ---- a/arch/hexagon/include/asm/spinlock_types.h -+++ b/arch/hexagon/include/asm/spinlock_types.h -@@ -8,10 +8,6 @@ - #ifndef _ASM_SPINLOCK_TYPES_H - #define _ASM_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H --# error "please don't include this file directly" --#endif -- - typedef struct { - volatile unsigned int lock; - } arch_spinlock_t; -diff --git a/arch/ia64/include/asm/spinlock_types.h b/arch/ia64/include/asm/spinlock_types.h -index 6e345fefcdca..681408d6816f 100644 ---- a/arch/ia64/include/asm/spinlock_types.h -+++ b/arch/ia64/include/asm/spinlock_types.h -@@ -2,10 +2,6 @@ - #ifndef _ASM_IA64_SPINLOCK_TYPES_H - #define _ASM_IA64_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H --# error "please don't include this file directly" --#endif -- - typedef struct { - volatile unsigned int lock; - } arch_spinlock_t; -diff --git a/arch/powerpc/include/asm/spinlock_types.h b/arch/powerpc/include/asm/spinlock_types.h -index c5d742f18021..cc6922a011ba 100644 ---- a/arch/powerpc/include/asm/spinlock_types.h -+++ b/arch/powerpc/include/asm/spinlock_types.h -@@ -2,10 +2,6 @@ - #ifndef _ASM_POWERPC_SPINLOCK_TYPES_H - #define _ASM_POWERPC_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H --# error "please don't include this file directly" --#endif -- - #ifdef CONFIG_PPC_QUEUED_SPINLOCKS - #include <asm-generic/qspinlock_types.h> - #include <asm-generic/qrwlock_types.h> -diff --git a/arch/s390/include/asm/spinlock_types.h b/arch/s390/include/asm/spinlock_types.h -index cfed272e4fd5..8e28e8176ec8 100644 ---- a/arch/s390/include/asm/spinlock_types.h -+++ b/arch/s390/include/asm/spinlock_types.h -@@ -2,10 +2,6 @@ - #ifndef __ASM_SPINLOCK_TYPES_H - #define __ASM_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H --# error "please don't include 
this file directly" --#endif -- - typedef struct { - int lock; - } __attribute__ ((aligned (4))) arch_spinlock_t; -diff --git a/arch/sh/include/asm/spinlock_types.h b/arch/sh/include/asm/spinlock_types.h -index e82369f286a2..22ca9a98bbb8 100644 ---- a/arch/sh/include/asm/spinlock_types.h -+++ b/arch/sh/include/asm/spinlock_types.h -@@ -2,10 +2,6 @@ - #ifndef __ASM_SH_SPINLOCK_TYPES_H - #define __ASM_SH_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H --# error "please don't include this file directly" --#endif -- - typedef struct { - volatile unsigned int lock; - } arch_spinlock_t; -diff --git a/arch/xtensa/include/asm/spinlock_types.h b/arch/xtensa/include/asm/spinlock_types.h -index 64c9389254f1..dc846323b1cd 100644 ---- a/arch/xtensa/include/asm/spinlock_types.h -+++ b/arch/xtensa/include/asm/spinlock_types.h -@@ -2,10 +2,6 @@ - #ifndef __ASM_SPINLOCK_TYPES_H - #define __ASM_SPINLOCK_TYPES_H - --#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H) --# error "please don't include this file directly" --#endif -- - #include <asm-generic/qspinlock_types.h> - #include <asm-generic/qrwlock_types.h> - --- -2.30.2 - diff --git a/debian/patches-rt/0231-locking-Make-spinlock_t-and-rwlock_t-a-RCU-section-o.patch b/debian/patches-rt/0231-locking-Make-spinlock_t-and-rwlock_t-a-RCU-section-o.patch deleted file mode 100644 index 3e9cc98e4..000000000 --- a/debian/patches-rt/0231-locking-Make-spinlock_t-and-rwlock_t-a-RCU-section-o.patch +++ /dev/null @@ -1,126 +0,0 @@ -From 880534e623004b2ac4bcb80f205ae11205939290 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 19 Nov 2019 09:25:04 +0100 -Subject: [PATCH 231/296] locking: Make spinlock_t and rwlock_t a RCU section - on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -On !RT a locked spinlock_t and rwlock_t disables preemption which -implies a RCU read section. There is code that relies on that behaviour. 
- -Add an explicit RCU read section on RT while a sleeping lock (a lock -which would disables preemption on !RT) acquired. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/locking/rtmutex.c | 6 ++++++ - kernel/locking/rwlock-rt.c | 6 ++++++ - 2 files changed, 12 insertions(+) - -diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c -index 4e8f73d9319d..4ea87d6c9ab7 100644 ---- a/kernel/locking/rtmutex.c -+++ b/kernel/locking/rtmutex.c -@@ -1136,6 +1136,7 @@ void __lockfunc rt_spin_lock(spinlock_t *lock) - { - spin_acquire(&lock->dep_map, 0, 0, _RET_IP_); - rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock); -+ rcu_read_lock(); - migrate_disable(); - } - EXPORT_SYMBOL(rt_spin_lock); -@@ -1150,6 +1151,7 @@ void __lockfunc rt_spin_lock_nested(spinlock_t *lock, int subclass) - { - spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_); - rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock); -+ rcu_read_lock(); - migrate_disable(); - } - EXPORT_SYMBOL(rt_spin_lock_nested); -@@ -1159,6 +1161,7 @@ void __lockfunc rt_spin_lock_nest_lock(spinlock_t *lock, - { - spin_acquire_nest(&lock->dep_map, 0, 0, nest_lock, _RET_IP_); - rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock); -+ rcu_read_lock(); - migrate_disable(); - } - EXPORT_SYMBOL(rt_spin_lock_nest_lock); -@@ -1169,6 +1172,7 @@ void __lockfunc rt_spin_unlock(spinlock_t *lock) - /* NOTE: we always pass in '1' for nested, for simplicity */ - spin_release(&lock->dep_map, _RET_IP_); - migrate_enable(); -+ rcu_read_unlock(); - rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock); - } - EXPORT_SYMBOL(rt_spin_unlock); -@@ -1198,6 +1202,7 @@ int __lockfunc rt_spin_trylock(spinlock_t *lock) - ret = __rt_mutex_trylock(&lock->lock); - if (ret) { - spin_acquire(&lock->dep_map, 0, 1, _RET_IP_); -+ rcu_read_lock(); - migrate_disable(); - } - return ret; -@@ -1212,6 +1217,7 @@ int __lockfunc rt_spin_trylock_bh(spinlock_t *lock) - ret = 
__rt_mutex_trylock(&lock->lock); - if (ret) { - spin_acquire(&lock->dep_map, 0, 1, _RET_IP_); -+ rcu_read_lock(); - migrate_disable(); - } else { - local_bh_enable(); -diff --git a/kernel/locking/rwlock-rt.c b/kernel/locking/rwlock-rt.c -index 16be7111aae7..3d2d1f14b513 100644 ---- a/kernel/locking/rwlock-rt.c -+++ b/kernel/locking/rwlock-rt.c -@@ -270,6 +270,7 @@ int __lockfunc rt_read_trylock(rwlock_t *rwlock) - ret = __read_rt_trylock(rwlock); - if (ret) { - rwlock_acquire_read(&rwlock->dep_map, 0, 1, _RET_IP_); -+ rcu_read_lock(); - migrate_disable(); - } - return ret; -@@ -283,6 +284,7 @@ int __lockfunc rt_write_trylock(rwlock_t *rwlock) - ret = __write_rt_trylock(rwlock); - if (ret) { - rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_); -+ rcu_read_lock(); - migrate_disable(); - } - return ret; -@@ -293,6 +295,7 @@ void __lockfunc rt_read_lock(rwlock_t *rwlock) - { - rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_); - __read_rt_lock(rwlock); -+ rcu_read_lock(); - migrate_disable(); - } - EXPORT_SYMBOL(rt_read_lock); -@@ -301,6 +304,7 @@ void __lockfunc rt_write_lock(rwlock_t *rwlock) - { - rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_); - __write_rt_lock(rwlock); -+ rcu_read_lock(); - migrate_disable(); - } - EXPORT_SYMBOL(rt_write_lock); -@@ -309,6 +313,7 @@ void __lockfunc rt_read_unlock(rwlock_t *rwlock) - { - rwlock_release(&rwlock->dep_map, _RET_IP_); - migrate_enable(); -+ rcu_read_unlock(); - __read_rt_unlock(rwlock); - } - EXPORT_SYMBOL(rt_read_unlock); -@@ -317,6 +322,7 @@ void __lockfunc rt_write_unlock(rwlock_t *rwlock) - { - rwlock_release(&rwlock->dep_map, _RET_IP_); - migrate_enable(); -+ rcu_read_unlock(); - __write_rt_unlock(rwlock); - } - EXPORT_SYMBOL(rt_write_unlock); --- -2.30.2 - diff --git a/debian/patches-rt/0232-rcutorture-Avoid-problematic-critical-section-nestin.patch b/debian/patches-rt/0232-rcutorture-Avoid-problematic-critical-section-nestin.patch deleted file mode 100644 index 6783d569a..000000000 --- 
a/debian/patches-rt/0232-rcutorture-Avoid-problematic-critical-section-nestin.patch +++ /dev/null @@ -1,196 +0,0 @@ -From b6fa508d144528f87363397b6ef32baf065b259c Mon Sep 17 00:00:00 2001 -From: Scott Wood <swood@redhat.com> -Date: Wed, 11 Sep 2019 17:57:29 +0100 -Subject: [PATCH 232/296] rcutorture: Avoid problematic critical section - nesting on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -rcutorture was generating some nesting scenarios that are not -reasonable. Constrain the state selection to avoid them. - -Example #1: - -1. preempt_disable() -2. local_bh_disable() -3. preempt_enable() -4. local_bh_enable() - -On PREEMPT_RT, BH disabling takes a local lock only when called in -non-atomic context. Thus, atomic context must be retained until after BH -is re-enabled. Likewise, if BH is initially disabled in non-atomic -context, it cannot be re-enabled in atomic context. - -Example #2: - -1. rcu_read_lock() -2. local_irq_disable() -3. rcu_read_unlock() -4. local_irq_enable() - -If the thread is preempted between steps 1 and 2, -rcu_read_unlock_special.b.blocked will be set, but it won't be -acted on in step 3 because IRQs are disabled. Thus, reporting of the -quiescent state will be delayed beyond the local_irq_enable(). - -For now, these scenarios will continue to be tested on non-PREEMPT_RT -kernels, until debug checks are added to ensure that they are not -happening elsewhere. - -Signed-off-by: Scott Wood <swood@redhat.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/rcu/rcutorture.c | 97 +++++++++++++++++++++++++++++++++++------ - 1 file changed, 83 insertions(+), 14 deletions(-) - -diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c -index 916ea4f66e4b..eb089d3035d5 100644 ---- a/kernel/rcu/rcutorture.c -+++ b/kernel/rcu/rcutorture.c -@@ -61,10 +61,13 @@ MODULE_AUTHOR("Paul E. 
McKenney <paulmck@linux.ibm.com> and Josh Triplett <josh@ - #define RCUTORTURE_RDR_RBH 0x08 /* ... rcu_read_lock_bh(). */ - #define RCUTORTURE_RDR_SCHED 0x10 /* ... rcu_read_lock_sched(). */ - #define RCUTORTURE_RDR_RCU 0x20 /* ... entering another RCU reader. */ --#define RCUTORTURE_RDR_NBITS 6 /* Number of bits defined above. */ -+#define RCUTORTURE_RDR_ATOM_BH 0x40 /* ... disabling bh while atomic */ -+#define RCUTORTURE_RDR_ATOM_RBH 0x80 /* ... RBH while atomic */ -+#define RCUTORTURE_RDR_NBITS 8 /* Number of bits defined above. */ - #define RCUTORTURE_MAX_EXTEND \ - (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \ -- RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED) -+ RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED | \ -+ RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH) - #define RCUTORTURE_RDR_MAX_LOOPS 0x7 /* Maximum reader extensions. */ - /* Must be power of two minus one. */ - #define RCUTORTURE_RDR_MAX_SEGS (RCUTORTURE_RDR_MAX_LOOPS + 3) -@@ -1235,31 +1238,53 @@ static void rcutorture_one_extend(int *readstate, int newstate, - WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1); - rtrsp->rt_readstate = newstate; - -- /* First, put new protection in place to avoid critical-section gap. */ -+ /* -+ * First, put new protection in place to avoid critical-section gap. -+ * Disable preemption around the ATOM disables to ensure that -+ * in_atomic() is true. 
-+ */ - if (statesnew & RCUTORTURE_RDR_BH) - local_bh_disable(); -+ if (statesnew & RCUTORTURE_RDR_RBH) -+ rcu_read_lock_bh(); - if (statesnew & RCUTORTURE_RDR_IRQ) - local_irq_disable(); - if (statesnew & RCUTORTURE_RDR_PREEMPT) - preempt_disable(); -- if (statesnew & RCUTORTURE_RDR_RBH) -- rcu_read_lock_bh(); - if (statesnew & RCUTORTURE_RDR_SCHED) - rcu_read_lock_sched(); -+ preempt_disable(); -+ if (statesnew & RCUTORTURE_RDR_ATOM_BH) -+ local_bh_disable(); -+ if (statesnew & RCUTORTURE_RDR_ATOM_RBH) -+ rcu_read_lock_bh(); -+ preempt_enable(); - if (statesnew & RCUTORTURE_RDR_RCU) - idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT; - -- /* Next, remove old protection, irq first due to bh conflict. */ -+ /* -+ * Next, remove old protection, in decreasing order of strength -+ * to avoid unlock paths that aren't safe in the stronger -+ * context. Disable preemption around the ATOM enables in -+ * case the context was only atomic due to IRQ disabling. -+ */ -+ preempt_disable(); - if (statesold & RCUTORTURE_RDR_IRQ) - local_irq_enable(); -- if (statesold & RCUTORTURE_RDR_BH) -+ if (statesold & RCUTORTURE_RDR_ATOM_BH) - local_bh_enable(); -+ if (statesold & RCUTORTURE_RDR_ATOM_RBH) -+ rcu_read_unlock_bh(); -+ preempt_enable(); - if (statesold & RCUTORTURE_RDR_PREEMPT) - preempt_enable(); -- if (statesold & RCUTORTURE_RDR_RBH) -- rcu_read_unlock_bh(); - if (statesold & RCUTORTURE_RDR_SCHED) - rcu_read_unlock_sched(); -+ if (statesold & RCUTORTURE_RDR_BH) -+ local_bh_enable(); -+ if (statesold & RCUTORTURE_RDR_RBH) -+ rcu_read_unlock_bh(); -+ - if (statesold & RCUTORTURE_RDR_RCU) { - bool lockit = !statesnew && !(torture_random(trsp) & 0xffff); - -@@ -1302,6 +1327,12 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp) - int mask = rcutorture_extend_mask_max(); - unsigned long randmask1 = torture_random(trsp) >> 8; - unsigned long randmask2 = randmask1 >> 3; -+ unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED; -+ unsigned 
long preempts_irq = preempts | RCUTORTURE_RDR_IRQ; -+ unsigned long nonatomic_bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH; -+ unsigned long atomic_bhs = RCUTORTURE_RDR_ATOM_BH | -+ RCUTORTURE_RDR_ATOM_RBH; -+ unsigned long tmp; - - WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT); - /* Mostly only one bit (need preemption!), sometimes lots of bits. */ -@@ -1309,11 +1340,49 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp) - mask = mask & randmask2; - else - mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS)); -- /* Can't enable bh w/irq disabled. */ -- if ((mask & RCUTORTURE_RDR_IRQ) && -- ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) || -- (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH)))) -- mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH; -+ -+ /* -+ * Can't enable bh w/irq disabled. -+ */ -+ tmp = atomic_bhs | nonatomic_bhs; -+ if (mask & RCUTORTURE_RDR_IRQ) -+ mask |= oldmask & tmp; -+ -+ /* -+ * Ideally these sequences would be detected in debug builds -+ * (regardless of RT), but until then don't stop testing -+ * them on non-RT. -+ */ -+ if (IS_ENABLED(CONFIG_PREEMPT_RT)) { -+ /* -+ * Can't release the outermost rcu lock in an irq disabled -+ * section without preemption also being disabled, if irqs -+ * had ever been enabled during this RCU critical section -+ * (could leak a special flag and delay reporting the qs). 
-+ */ -+ if ((oldmask & RCUTORTURE_RDR_RCU) && -+ (mask & RCUTORTURE_RDR_IRQ) && -+ !(mask & preempts)) -+ mask |= RCUTORTURE_RDR_RCU; -+ -+ /* Can't modify atomic bh in non-atomic context */ -+ if ((oldmask & atomic_bhs) && (mask & atomic_bhs) && -+ !(mask & preempts_irq)) { -+ mask |= oldmask & preempts_irq; -+ if (mask & RCUTORTURE_RDR_IRQ) -+ mask |= oldmask & tmp; -+ } -+ if ((mask & atomic_bhs) && !(mask & preempts_irq)) -+ mask |= RCUTORTURE_RDR_PREEMPT; -+ -+ /* Can't modify non-atomic bh in atomic context */ -+ tmp = nonatomic_bhs; -+ if (oldmask & preempts_irq) -+ mask &= ~tmp; -+ if ((oldmask | mask) & preempts_irq) -+ mask |= oldmask & tmp; -+ } -+ - return mask ?: RCUTORTURE_RDR_RCU; - } - --- -2.30.2 - diff --git a/debian/patches-rt/0233-mm-vmalloc-Another-preempt-disable-region-which-suck.patch b/debian/patches-rt/0233-mm-vmalloc-Another-preempt-disable-region-which-suck.patch deleted file mode 100644 index 44a18448f..000000000 --- a/debian/patches-rt/0233-mm-vmalloc-Another-preempt-disable-region-which-suck.patch +++ /dev/null @@ -1,73 +0,0 @@ -From 80953e63313bfe698c24fa8a8a90a9f34c0e1f21 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 12 Jul 2011 11:39:36 +0200 -Subject: [PATCH 233/296] mm/vmalloc: Another preempt disable region which - sucks -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Avoid the preempt disable version of get_cpu_var(). The inner-lock should -provide enough serialisation. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - mm/vmalloc.c | 13 ++++++++----- - 1 file changed, 8 insertions(+), 5 deletions(-) - -diff --git a/mm/vmalloc.c b/mm/vmalloc.c -index fff03a331314..2cbbbdb69ec4 100644 ---- a/mm/vmalloc.c -+++ b/mm/vmalloc.c -@@ -1542,7 +1542,7 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask) - struct vmap_block *vb; - struct vmap_area *va; - unsigned long vb_idx; -- int node, err; -+ int node, err, cpu; - void *vaddr; - - node = numa_node_id(); -@@ -1579,11 +1579,12 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask) - return ERR_PTR(err); - } - -- vbq = &get_cpu_var(vmap_block_queue); -+ cpu = get_cpu_light(); -+ vbq = this_cpu_ptr(&vmap_block_queue); - spin_lock(&vbq->lock); - list_add_tail_rcu(&vb->free_list, &vbq->free); - spin_unlock(&vbq->lock); -- put_cpu_var(vmap_block_queue); -+ put_cpu_light(); - - return vaddr; - } -@@ -1648,6 +1649,7 @@ static void *vb_alloc(unsigned long size, gfp_t gfp_mask) - struct vmap_block *vb; - void *vaddr = NULL; - unsigned int order; -+ int cpu; - - BUG_ON(offset_in_page(size)); - BUG_ON(size > PAGE_SIZE*VMAP_MAX_ALLOC); -@@ -1662,7 +1664,8 @@ static void *vb_alloc(unsigned long size, gfp_t gfp_mask) - order = get_order(size); - - rcu_read_lock(); -- vbq = &get_cpu_var(vmap_block_queue); -+ cpu = get_cpu_light(); -+ vbq = this_cpu_ptr(&vmap_block_queue); - list_for_each_entry_rcu(vb, &vbq->free, free_list) { - unsigned long pages_off; - -@@ -1685,7 +1688,7 @@ static void *vb_alloc(unsigned long size, gfp_t gfp_mask) - break; - } - -- put_cpu_var(vmap_block_queue); -+ put_cpu_light(); - rcu_read_unlock(); - - /* Allocate new block if nothing was found */ --- -2.30.2 - diff --git a/debian/patches-rt/0238-rt-Introduce-cpu_chill.patch b/debian/patches-rt/0238-rt-Introduce-cpu_chill.patch deleted file mode 100644 index 93c0111cb..000000000 --- a/debian/patches-rt/0238-rt-Introduce-cpu_chill.patch +++ /dev/null @@ -1,122 +0,0 @@ -From 
7850ec3df9b7b3b169a01ab8e5e660d99da6f89c Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Wed, 7 Mar 2012 20:51:03 +0100 -Subject: [PATCH 238/296] rt: Introduce cpu_chill() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Retry loops on RT might loop forever when the modifying side was -preempted. Add cpu_chill() to replace cpu_relax(). cpu_chill() -defaults to cpu_relax() for non RT. On RT it puts the looping task to -sleep for a tick so the preempted task can make progress. - -Steven Rostedt changed it to use a hrtimer instead of msleep(): -| -|Ulrich Obergfell pointed out that cpu_chill() calls msleep() which is woken -|up by the ksoftirqd running the TIMER softirq. But as the cpu_chill() is -|called from softirq context, it may block the ksoftirqd() from running, in -|which case, it may never wake up the msleep() causing the deadlock. - -+ bigeasy later changed to schedule_hrtimeout() -|If a task calls cpu_chill() and gets woken up by a regular or spurious -|wakeup and has a signal pending, then it exits the sleep loop in -|do_nanosleep() and sets up the restart block. If restart->nanosleep.type is -|not TI_NONE then this results in accessing a stale user pointer from a -|previously interrupted syscall and a copy to user based on the stale -|pointer or a BUG() when 'type' is not supported in nanosleep_copyout(). - -+ bigeasy: add PF_NOFREEZE: -| [....] Waiting for /dev to be fully populated... -| ===================================== -| [ BUG: udevd/229 still has locks held! 
] -| 3.12.11-rt17 #23 Not tainted -| ------------------------------------- -| 1 lock held by udevd/229: -| #0: (&type->i_mutex_dir_key#2){+.+.+.}, at: lookup_slow+0x28/0x98 -| -| stack backtrace: -| CPU: 0 PID: 229 Comm: udevd Not tainted 3.12.11-rt17 #23 -| (unwind_backtrace+0x0/0xf8) from (show_stack+0x10/0x14) -| (show_stack+0x10/0x14) from (dump_stack+0x74/0xbc) -| (dump_stack+0x74/0xbc) from (do_nanosleep+0x120/0x160) -| (do_nanosleep+0x120/0x160) from (hrtimer_nanosleep+0x90/0x110) -| (hrtimer_nanosleep+0x90/0x110) from (cpu_chill+0x30/0x38) -| (cpu_chill+0x30/0x38) from (dentry_kill+0x158/0x1ec) -| (dentry_kill+0x158/0x1ec) from (dput+0x74/0x15c) -| (dput+0x74/0x15c) from (lookup_real+0x4c/0x50) -| (lookup_real+0x4c/0x50) from (__lookup_hash+0x34/0x44) -| (__lookup_hash+0x34/0x44) from (lookup_slow+0x38/0x98) -| (lookup_slow+0x38/0x98) from (path_lookupat+0x208/0x7fc) -| (path_lookupat+0x208/0x7fc) from (filename_lookup+0x20/0x60) -| (filename_lookup+0x20/0x60) from (user_path_at_empty+0x50/0x7c) -| (user_path_at_empty+0x50/0x7c) from (user_path_at+0x14/0x1c) -| (user_path_at+0x14/0x1c) from (vfs_fstatat+0x48/0x94) -| (vfs_fstatat+0x48/0x94) from (SyS_stat64+0x14/0x30) -| (SyS_stat64+0x14/0x30) from (ret_fast_syscall+0x0/0x48) - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -Signed-off-by: Steven Rostedt <rostedt@goodmis.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/delay.h | 6 ++++++ - kernel/time/hrtimer.c | 30 ++++++++++++++++++++++++++++++ - 2 files changed, 36 insertions(+) - -diff --git a/include/linux/delay.h b/include/linux/delay.h -index 1d0e2ce6b6d9..02b37178b54f 100644 ---- a/include/linux/delay.h -+++ b/include/linux/delay.h -@@ -76,4 +76,10 @@ static inline void fsleep(unsigned long usecs) - msleep(DIV_ROUND_UP(usecs, 1000)); - } - -+#ifdef CONFIG_PREEMPT_RT -+extern void cpu_chill(void); -+#else -+# define cpu_chill() cpu_relax() -+#endif -+ - #endif /* defined(_LINUX_DELAY_H) */ -diff --git 
a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c -index 9505b1f21cdf..1256f2ce948b 100644 ---- a/kernel/time/hrtimer.c -+++ b/kernel/time/hrtimer.c -@@ -2006,6 +2006,36 @@ SYSCALL_DEFINE2(nanosleep_time32, struct old_timespec32 __user *, rqtp, - } - #endif - -+#ifdef CONFIG_PREEMPT_RT -+/* -+ * Sleep for 1 ms in hope whoever holds what we want will let it go. -+ */ -+void cpu_chill(void) -+{ -+ unsigned int freeze_flag = current->flags & PF_NOFREEZE; -+ struct task_struct *self = current; -+ ktime_t chill_time; -+ -+ raw_spin_lock_irq(&self->pi_lock); -+ self->saved_state = self->state; -+ __set_current_state_no_track(TASK_UNINTERRUPTIBLE); -+ raw_spin_unlock_irq(&self->pi_lock); -+ -+ chill_time = ktime_set(0, NSEC_PER_MSEC); -+ -+ current->flags |= PF_NOFREEZE; -+ schedule_hrtimeout(&chill_time, HRTIMER_MODE_REL_HARD); -+ if (!freeze_flag) -+ current->flags &= ~PF_NOFREEZE; -+ -+ raw_spin_lock_irq(&self->pi_lock); -+ __set_current_state_no_track(self->saved_state); -+ self->saved_state = TASK_RUNNING; -+ raw_spin_unlock_irq(&self->pi_lock); -+} -+EXPORT_SYMBOL(cpu_chill); -+#endif -+ - /* - * Functions related to boot-time initialization: - */ --- -2.30.2 - diff --git a/debian/patches-rt/0239-fs-namespace-Use-cpu_chill-in-trylock-loops.patch b/debian/patches-rt/0239-fs-namespace-Use-cpu_chill-in-trylock-loops.patch deleted file mode 100644 index 6a01f000b..000000000 --- a/debian/patches-rt/0239-fs-namespace-Use-cpu_chill-in-trylock-loops.patch +++ /dev/null @@ -1,44 +0,0 @@ -From 80726815fd0962cb9eafb7c75a30356562ff9809 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Wed, 7 Mar 2012 21:00:34 +0100 -Subject: [PATCH 239/296] fs: namespace: Use cpu_chill() in trylock loops -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Retry loops on RT might loop forever when the modifying side was -preempted. Use cpu_chill() instead of cpu_relax() to let the system -make progress. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - fs/namespace.c | 8 ++++++-- - 1 file changed, 6 insertions(+), 2 deletions(-) - -diff --git a/fs/namespace.c b/fs/namespace.c -index c7fbb50a5aaa..5e261e50625a 100644 ---- a/fs/namespace.c -+++ b/fs/namespace.c -@@ -14,6 +14,7 @@ - #include <linux/mnt_namespace.h> - #include <linux/user_namespace.h> - #include <linux/namei.h> -+#include <linux/delay.h> - #include <linux/security.h> - #include <linux/cred.h> - #include <linux/idr.h> -@@ -321,8 +322,11 @@ int __mnt_want_write(struct vfsmount *m) - * incremented count after it has set MNT_WRITE_HOLD. - */ - smp_mb(); -- while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD) -- cpu_relax(); -+ while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD) { -+ preempt_enable(); -+ cpu_chill(); -+ preempt_disable(); -+ } - /* - * After the slowpath clears MNT_WRITE_HOLD, mnt_is_readonly will - * be set to match its requirements. So we must not load that until --- -2.30.2 - diff --git a/debian/patches-rt/0240-debugobjects-Make-RT-aware.patch b/debian/patches-rt/0240-debugobjects-Make-RT-aware.patch deleted file mode 100644 index 348a3bce9..000000000 --- a/debian/patches-rt/0240-debugobjects-Make-RT-aware.patch +++ /dev/null @@ -1,32 +0,0 @@ -From 973564311384a73897551293251c27b0bd7b12c2 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Sun, 17 Jul 2011 21:41:35 +0200 -Subject: [PATCH 240/296] debugobjects: Make RT aware -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Avoid filling the pool / allocating memory with irqs off(). 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - lib/debugobjects.c | 5 ++++- - 1 file changed, 4 insertions(+), 1 deletion(-) - -diff --git a/lib/debugobjects.c b/lib/debugobjects.c -index 9e14ae02306b..083882a3cf2f 100644 ---- a/lib/debugobjects.c -+++ b/lib/debugobjects.c -@@ -557,7 +557,10 @@ __debug_object_init(void *addr, const struct debug_obj_descr *descr, int onstack - struct debug_obj *obj; - unsigned long flags; - -- fill_pool(); -+#ifdef CONFIG_PREEMPT_RT -+ if (preempt_count() == 0 && !irqs_disabled()) -+#endif -+ fill_pool(); - - db = get_bucket((unsigned long) addr); - --- -2.30.2 - diff --git a/debian/patches-rt/0244-irqwork-push-most-work-into-softirq-context.patch b/debian/patches-rt/0244-irqwork-push-most-work-into-softirq-context.patch deleted file mode 100644 index 574d28346..000000000 --- a/debian/patches-rt/0244-irqwork-push-most-work-into-softirq-context.patch +++ /dev/null @@ -1,189 +0,0 @@ -From b45a276a45e03c5aba631f98600fb9e7afffa1c6 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 23 Jun 2015 15:32:51 +0200 -Subject: [PATCH 244/296] irqwork: push most work into softirq context -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Initially we defered all irqwork into softirq because we didn't want the -latency spikes if perf or another user was busy and delayed the RT task. -The NOHZ trigger (nohz_full_kick_work) was the first user that did not work -as expected if it did not run in the original irqwork context so we had to -bring it back somehow for it. push_irq_work_func is the second one that -requires this. - -This patch adds the IRQ_WORK_HARD_IRQ which makes sure the callback runs -in raw-irq context. Everything else is defered into softirq context. Without --RT we have the orignal behavior. 
- -This patch incorporates tglx orignal work which revoked a little bringing back -the arch_irq_work_raise() if possible and a few fixes from Steven Rostedt and -Mike Galbraith, - -[bigeasy: melt tglx's irq_work_tick_soft() which splits irq_work_tick() into a - hard and soft variant] -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/irq_work.h | 6 +++++ - kernel/irq_work.c | 58 +++++++++++++++++++++++++++++++--------- - kernel/sched/topology.c | 1 + - kernel/time/timer.c | 2 ++ - 4 files changed, 55 insertions(+), 12 deletions(-) - -diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h -index 30823780c192..f941f2d7d71c 100644 ---- a/include/linux/irq_work.h -+++ b/include/linux/irq_work.h -@@ -55,4 +55,10 @@ static inline void irq_work_run(void) { } - static inline void irq_work_single(void *arg) { } - #endif - -+#if defined(CONFIG_IRQ_WORK) && defined(CONFIG_PREEMPT_RT) -+void irq_work_tick_soft(void); -+#else -+static inline void irq_work_tick_soft(void) { } -+#endif -+ - #endif /* _LINUX_IRQ_WORK_H */ -diff --git a/kernel/irq_work.c b/kernel/irq_work.c -index eca83965b631..8183d30e1bb1 100644 ---- a/kernel/irq_work.c -+++ b/kernel/irq_work.c -@@ -18,6 +18,7 @@ - #include <linux/cpu.h> - #include <linux/notifier.h> - #include <linux/smp.h> -+#include <linux/interrupt.h> - #include <asm/processor.h> - - -@@ -52,13 +53,19 @@ void __weak arch_irq_work_raise(void) - /* Enqueue on current CPU, work must already be claimed and preempt disabled */ - static void __irq_work_queue_local(struct irq_work *work) - { -+ struct llist_head *list; -+ bool lazy_work, realtime = IS_ENABLED(CONFIG_PREEMPT_RT); -+ -+ lazy_work = atomic_read(&work->flags) & IRQ_WORK_LAZY; -+ - /* If the work is "lazy", handle it from next tick if any */ -- if (atomic_read(&work->flags) & IRQ_WORK_LAZY) { -- if (llist_add(&work->llnode, this_cpu_ptr(&lazy_list)) && -- tick_nohz_tick_stopped()) -- arch_irq_work_raise(); -- } else { -- if 
(llist_add(&work->llnode, this_cpu_ptr(&raised_list))) -+ if (lazy_work || (realtime && !(atomic_read(&work->flags) & IRQ_WORK_HARD_IRQ))) -+ list = this_cpu_ptr(&lazy_list); -+ else -+ list = this_cpu_ptr(&raised_list); -+ -+ if (llist_add(&work->llnode, list)) { -+ if (!lazy_work || tick_nohz_tick_stopped()) - arch_irq_work_raise(); - } - } -@@ -102,7 +109,13 @@ bool irq_work_queue_on(struct irq_work *work, int cpu) - if (cpu != smp_processor_id()) { - /* Arch remote IPI send/receive backend aren't NMI safe */ - WARN_ON_ONCE(in_nmi()); -- __smp_call_single_queue(cpu, &work->llnode); -+ -+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && !(atomic_read(&work->flags) & IRQ_WORK_HARD_IRQ)) { -+ if (llist_add(&work->llnode, &per_cpu(lazy_list, cpu))) -+ arch_send_call_function_single_ipi(cpu); -+ } else { -+ __smp_call_single_queue(cpu, &work->llnode); -+ } - } else { - __irq_work_queue_local(work); - } -@@ -120,9 +133,8 @@ bool irq_work_needs_cpu(void) - raised = this_cpu_ptr(&raised_list); - lazy = this_cpu_ptr(&lazy_list); - -- if (llist_empty(raised) || arch_irq_work_has_interrupt()) -- if (llist_empty(lazy)) -- return false; -+ if (llist_empty(raised) && llist_empty(lazy)) -+ return false; - - /* All work should have been flushed before going offline */ - WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); -@@ -160,8 +172,12 @@ static void irq_work_run_list(struct llist_head *list) - struct irq_work *work, *tmp; - struct llist_node *llnode; - -+#ifndef CONFIG_PREEMPT_RT -+ /* -+ * nort: On RT IRQ-work may run in SOFTIRQ context. 
-+ */ - BUG_ON(!irqs_disabled()); -- -+#endif - if (llist_empty(list)) - return; - -@@ -177,7 +193,16 @@ static void irq_work_run_list(struct llist_head *list) - void irq_work_run(void) - { - irq_work_run_list(this_cpu_ptr(&raised_list)); -- irq_work_run_list(this_cpu_ptr(&lazy_list)); -+ if (IS_ENABLED(CONFIG_PREEMPT_RT)) { -+ /* -+ * NOTE: we raise softirq via IPI for safety, -+ * and execute in irq_work_tick() to move the -+ * overhead from hard to soft irq context. -+ */ -+ if (!llist_empty(this_cpu_ptr(&lazy_list))) -+ raise_softirq(TIMER_SOFTIRQ); -+ } else -+ irq_work_run_list(this_cpu_ptr(&lazy_list)); - } - EXPORT_SYMBOL_GPL(irq_work_run); - -@@ -187,8 +212,17 @@ void irq_work_tick(void) - - if (!llist_empty(raised) && !arch_irq_work_has_interrupt()) - irq_work_run_list(raised); -+ -+ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) -+ irq_work_run_list(this_cpu_ptr(&lazy_list)); -+} -+ -+#if defined(CONFIG_IRQ_WORK) && defined(CONFIG_PREEMPT_RT) -+void irq_work_tick_soft(void) -+{ - irq_work_run_list(this_cpu_ptr(&lazy_list)); - } -+#endif - - /* - * Synchronize against the irq_work @entry, ensures the entry is not -diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c -index dd7770226086..c2d1d1ede0f4 100644 ---- a/kernel/sched/topology.c -+++ b/kernel/sched/topology.c -@@ -514,6 +514,7 @@ static int init_rootdomain(struct root_domain *rd) - rd->rto_cpu = -1; - raw_spin_lock_init(&rd->rto_lock); - init_irq_work(&rd->rto_push_work, rto_push_irq_work_func); -+ atomic_or(IRQ_WORK_HARD_IRQ, &rd->rto_push_work.flags); - #endif - - init_dl_bw(&rd->dl_bw); -diff --git a/kernel/time/timer.c b/kernel/time/timer.c -index 14d9eb790b31..b6477db234e6 100644 ---- a/kernel/time/timer.c -+++ b/kernel/time/timer.c -@@ -1766,6 +1766,8 @@ static __latent_entropy void run_timer_softirq(struct softirq_action *h) - { - struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); - -+ irq_work_tick_soft(); -+ - __run_timers(base); - if (IS_ENABLED(CONFIG_NO_HZ_COMMON)) - 
__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF])); --- -2.30.2 - diff --git a/debian/patches-rt/0245-x86-crypto-Reduce-preempt-disabled-regions.patch b/debian/patches-rt/0245-x86-crypto-Reduce-preempt-disabled-regions.patch deleted file mode 100644 index 2adc2a48d..000000000 --- a/debian/patches-rt/0245-x86-crypto-Reduce-preempt-disabled-regions.patch +++ /dev/null @@ -1,118 +0,0 @@ -From ea01714abd36e1dcafdd07d9ac40a707d550487f Mon Sep 17 00:00:00 2001 -From: Peter Zijlstra <peterz@infradead.org> -Date: Mon, 14 Nov 2011 18:19:27 +0100 -Subject: [PATCH 245/296] x86: crypto: Reduce preempt disabled regions -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Restrict the preempt disabled regions to the actual floating point -operations and enable preemption for the administrative actions. - -This is necessary on RT to avoid that kfree and other operations are -called with preemption disabled. - -Reported-and-tested-by: Carsten Emde <cbe@osadl.org> -Signed-off-by: Peter Zijlstra <peterz@infradead.org> - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - arch/x86/crypto/aesni-intel_glue.c | 22 ++++++++++++---------- - 1 file changed, 12 insertions(+), 10 deletions(-) - -diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c -index be891fdf8d17..29c716ed103f 100644 ---- a/arch/x86/crypto/aesni-intel_glue.c -+++ b/arch/x86/crypto/aesni-intel_glue.c -@@ -379,14 +379,14 @@ static int ecb_encrypt(struct skcipher_request *req) - - err = skcipher_walk_virt(&walk, req, true); - -- kernel_fpu_begin(); - while ((nbytes = walk.nbytes)) { -+ kernel_fpu_begin(); - aesni_ecb_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK); -+ kernel_fpu_end(); - nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); - } -- kernel_fpu_end(); - - return err; - } -@@ -401,14 +401,14 @@ static int ecb_decrypt(struct skcipher_request *req) - - err = 
skcipher_walk_virt(&walk, req, true); - -- kernel_fpu_begin(); - while ((nbytes = walk.nbytes)) { -+ kernel_fpu_begin(); - aesni_ecb_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK); -+ kernel_fpu_end(); - nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); - } -- kernel_fpu_end(); - - return err; - } -@@ -423,14 +423,14 @@ static int cbc_encrypt(struct skcipher_request *req) - - err = skcipher_walk_virt(&walk, req, true); - -- kernel_fpu_begin(); - while ((nbytes = walk.nbytes)) { -+ kernel_fpu_begin(); - aesni_cbc_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK, walk.iv); -+ kernel_fpu_end(); - nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); - } -- kernel_fpu_end(); - - return err; - } -@@ -445,14 +445,14 @@ static int cbc_decrypt(struct skcipher_request *req) - - err = skcipher_walk_virt(&walk, req, true); - -- kernel_fpu_begin(); - while ((nbytes = walk.nbytes)) { -+ kernel_fpu_begin(); - aesni_cbc_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK, walk.iv); -+ kernel_fpu_end(); - nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); - } -- kernel_fpu_end(); - - return err; - } -@@ -500,18 +500,20 @@ static int ctr_crypt(struct skcipher_request *req) - - err = skcipher_walk_virt(&walk, req, true); - -- kernel_fpu_begin(); - while ((nbytes = walk.nbytes) >= AES_BLOCK_SIZE) { -+ kernel_fpu_begin(); - aesni_ctr_enc_tfm(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK, walk.iv); -+ kernel_fpu_end(); - nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); - } - if (walk.nbytes) { -+ kernel_fpu_begin(); - ctr_crypt_final(ctx, &walk); -+ kernel_fpu_end(); - err = skcipher_walk_done(&walk, 0); - } -- kernel_fpu_end(); - - return err; - } --- -2.30.2 - diff --git a/debian/patches-rt/0246-crypto-Reduce-preempt-disabled-regions-more-algos.patch 
b/debian/patches-rt/0246-crypto-Reduce-preempt-disabled-regions-more-algos.patch deleted file mode 100644 index 8c5ac27a0..000000000 --- a/debian/patches-rt/0246-crypto-Reduce-preempt-disabled-regions-more-algos.patch +++ /dev/null @@ -1,241 +0,0 @@ -From f202e5bb17965a13442590c7151da4784f4dc46c Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 21 Feb 2014 17:24:04 +0100 -Subject: [PATCH 246/296] crypto: Reduce preempt disabled regions, more algos -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Don Estabrook reported -| kernel: WARNING: CPU: 2 PID: 858 at kernel/sched/core.c:2428 migrate_disable+0xed/0x100() -| kernel: WARNING: CPU: 2 PID: 858 at kernel/sched/core.c:2462 migrate_enable+0x17b/0x200() -| kernel: WARNING: CPU: 3 PID: 865 at kernel/sched/core.c:2428 migrate_disable+0xed/0x100() - -and his backtrace showed some crypto functions which looked fine. - -The problem is the following sequence: - -glue_xts_crypt_128bit() -{ - blkcipher_walk_virt(); /* normal migrate_disable() */ - - glue_fpu_begin(); /* get atomic */ - - while (nbytes) { - __glue_xts_crypt_128bit(); - blkcipher_walk_done(); /* with nbytes = 0, migrate_enable() - * while we are atomic */ - }; - glue_fpu_end() /* no longer atomic */ -} - -and this is why the counter get out of sync and the warning is printed. -The other problem is that we are non-preemptible between -glue_fpu_begin() and glue_fpu_end() and the latency grows. To fix this, -I shorten the FPU off region and ensure blkcipher_walk_done() is called -with preemption enabled. This might hurt the performance because we now -enable/disable the FPU state more often but we gain lower latency and -the bug is gone. 
- - -Reported-by: Don Estabrook <don.estabrook@gmail.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/x86/crypto/cast5_avx_glue.c | 21 +++++++++------------ - arch/x86/crypto/glue_helper.c | 26 +++++++++++++++----------- - 2 files changed, 24 insertions(+), 23 deletions(-) - -diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c -index 384ccb00f9e1..2f8df8ef8644 100644 ---- a/arch/x86/crypto/cast5_avx_glue.c -+++ b/arch/x86/crypto/cast5_avx_glue.c -@@ -46,7 +46,7 @@ static inline void cast5_fpu_end(bool fpu_enabled) - - static int ecb_crypt(struct skcipher_request *req, bool enc) - { -- bool fpu_enabled = false; -+ bool fpu_enabled; - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct cast5_ctx *ctx = crypto_skcipher_ctx(tfm); - struct skcipher_walk walk; -@@ -61,7 +61,7 @@ static int ecb_crypt(struct skcipher_request *req, bool enc) - u8 *wsrc = walk.src.virt.addr; - u8 *wdst = walk.dst.virt.addr; - -- fpu_enabled = cast5_fpu_begin(fpu_enabled, &walk, nbytes); -+ fpu_enabled = cast5_fpu_begin(false, &walk, nbytes); - - /* Process multi-block batch */ - if (nbytes >= bsize * CAST5_PARALLEL_BLOCKS) { -@@ -90,10 +90,9 @@ static int ecb_crypt(struct skcipher_request *req, bool enc) - } while (nbytes >= bsize); - - done: -+ cast5_fpu_end(fpu_enabled); - err = skcipher_walk_done(&walk, nbytes); - } -- -- cast5_fpu_end(fpu_enabled); - return err; - } - -@@ -197,7 +196,7 @@ static int cbc_decrypt(struct skcipher_request *req) - { - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct cast5_ctx *ctx = crypto_skcipher_ctx(tfm); -- bool fpu_enabled = false; -+ bool fpu_enabled; - struct skcipher_walk walk; - unsigned int nbytes; - int err; -@@ -205,12 +204,11 @@ static int cbc_decrypt(struct skcipher_request *req) - err = skcipher_walk_virt(&walk, req, false); - - while ((nbytes = walk.nbytes)) { -- fpu_enabled = cast5_fpu_begin(fpu_enabled, &walk, nbytes); -+ fpu_enabled = 
cast5_fpu_begin(false, &walk, nbytes); - nbytes = __cbc_decrypt(ctx, &walk); -+ cast5_fpu_end(fpu_enabled); - err = skcipher_walk_done(&walk, nbytes); - } -- -- cast5_fpu_end(fpu_enabled); - return err; - } - -@@ -277,7 +275,7 @@ static int ctr_crypt(struct skcipher_request *req) - { - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct cast5_ctx *ctx = crypto_skcipher_ctx(tfm); -- bool fpu_enabled = false; -+ bool fpu_enabled; - struct skcipher_walk walk; - unsigned int nbytes; - int err; -@@ -285,13 +283,12 @@ static int ctr_crypt(struct skcipher_request *req) - err = skcipher_walk_virt(&walk, req, false); - - while ((nbytes = walk.nbytes) >= CAST5_BLOCK_SIZE) { -- fpu_enabled = cast5_fpu_begin(fpu_enabled, &walk, nbytes); -+ fpu_enabled = cast5_fpu_begin(false, &walk, nbytes); - nbytes = __ctr_crypt(&walk, ctx); -+ cast5_fpu_end(fpu_enabled); - err = skcipher_walk_done(&walk, nbytes); - } - -- cast5_fpu_end(fpu_enabled); -- - if (walk.nbytes) { - ctr_crypt_final(&walk, ctx); - err = skcipher_walk_done(&walk, 0); -diff --git a/arch/x86/crypto/glue_helper.c b/arch/x86/crypto/glue_helper.c -index d3d91a0abf88..6d0774721514 100644 ---- a/arch/x86/crypto/glue_helper.c -+++ b/arch/x86/crypto/glue_helper.c -@@ -24,7 +24,7 @@ int glue_ecb_req_128bit(const struct common_glue_ctx *gctx, - void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req)); - const unsigned int bsize = 128 / 8; - struct skcipher_walk walk; -- bool fpu_enabled = false; -+ bool fpu_enabled; - unsigned int nbytes; - int err; - -@@ -37,7 +37,7 @@ int glue_ecb_req_128bit(const struct common_glue_ctx *gctx, - unsigned int i; - - fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit, -- &walk, fpu_enabled, nbytes); -+ &walk, false, nbytes); - for (i = 0; i < gctx->num_funcs; i++) { - func_bytes = bsize * gctx->funcs[i].num_blocks; - -@@ -55,10 +55,9 @@ int glue_ecb_req_128bit(const struct common_glue_ctx *gctx, - if (nbytes < bsize) - break; - } -+ glue_fpu_end(fpu_enabled); - err = 
skcipher_walk_done(&walk, nbytes); - } -- -- glue_fpu_end(fpu_enabled); - return err; - } - EXPORT_SYMBOL_GPL(glue_ecb_req_128bit); -@@ -101,7 +100,7 @@ int glue_cbc_decrypt_req_128bit(const struct common_glue_ctx *gctx, - void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req)); - const unsigned int bsize = 128 / 8; - struct skcipher_walk walk; -- bool fpu_enabled = false; -+ bool fpu_enabled; - unsigned int nbytes; - int err; - -@@ -115,7 +114,7 @@ int glue_cbc_decrypt_req_128bit(const struct common_glue_ctx *gctx, - u128 last_iv; - - fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit, -- &walk, fpu_enabled, nbytes); -+ &walk, false, nbytes); - /* Start of the last block. */ - src += nbytes / bsize - 1; - dst += nbytes / bsize - 1; -@@ -148,10 +147,10 @@ int glue_cbc_decrypt_req_128bit(const struct common_glue_ctx *gctx, - done: - u128_xor(dst, dst, (u128 *)walk.iv); - *(u128 *)walk.iv = last_iv; -+ glue_fpu_end(fpu_enabled); - err = skcipher_walk_done(&walk, nbytes); - } - -- glue_fpu_end(fpu_enabled); - return err; - } - EXPORT_SYMBOL_GPL(glue_cbc_decrypt_req_128bit); -@@ -162,7 +161,7 @@ int glue_ctr_req_128bit(const struct common_glue_ctx *gctx, - void *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req)); - const unsigned int bsize = 128 / 8; - struct skcipher_walk walk; -- bool fpu_enabled = false; -+ bool fpu_enabled; - unsigned int nbytes; - int err; - -@@ -176,7 +175,7 @@ int glue_ctr_req_128bit(const struct common_glue_ctx *gctx, - le128 ctrblk; - - fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit, -- &walk, fpu_enabled, nbytes); -+ &walk, false, nbytes); - - be128_to_le128(&ctrblk, (be128 *)walk.iv); - -@@ -202,11 +201,10 @@ int glue_ctr_req_128bit(const struct common_glue_ctx *gctx, - } - - le128_to_be128((be128 *)walk.iv, &ctrblk); -+ glue_fpu_end(fpu_enabled); - err = skcipher_walk_done(&walk, nbytes); - } - -- glue_fpu_end(fpu_enabled); -- - if (nbytes) { - le128 ctrblk; - u128 tmp; -@@ -306,8 +304,14 @@ int 
glue_xts_req_128bit(const struct common_glue_ctx *gctx, - tweak_fn(tweak_ctx, walk.iv, walk.iv); - - while (nbytes) { -+ fpu_enabled = glue_fpu_begin(bsize, gctx->fpu_blocks_limit, -+ &walk, fpu_enabled, -+ nbytes < bsize ? bsize : nbytes); - nbytes = __glue_xts_req_128bit(gctx, crypt_ctx, &walk); - -+ glue_fpu_end(fpu_enabled); -+ fpu_enabled = false; -+ - err = skcipher_walk_done(&walk, nbytes); - nbytes = walk.nbytes; - } --- -2.30.2 - diff --git a/debian/patches-rt/0247-crypto-limit-more-FPU-enabled-sections.patch b/debian/patches-rt/0247-crypto-limit-more-FPU-enabled-sections.patch deleted file mode 100644 index 5bb6d6824..000000000 --- a/debian/patches-rt/0247-crypto-limit-more-FPU-enabled-sections.patch +++ /dev/null @@ -1,74 +0,0 @@ -From 6c092aff3973bf928980ae1c582b91b9050311cb Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 30 Nov 2017 13:40:10 +0100 -Subject: [PATCH 247/296] crypto: limit more FPU-enabled sections -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Those crypto drivers use SSE/AVX/… for their crypto work and in order to -do so in kernel they need to enable the "FPU" in kernel mode which -disables preemption. -There are two problems with the way they are used: -- the while loop which processes X bytes may create latency spikes and - should be avoided or limited. -- the cipher-walk-next part may allocate/free memory and may use - kmap_atomic(). - -The whole kernel_fpu_begin()/end() processing isn't probably that cheap. -It most likely makes sense to process as much of those as possible in one -go. The new *_fpu_sched_rt() schedules only if a RT task is pending. - -Probably we should measure the performance those ciphers in pure SW -mode and with this optimisations to see if it makes sense to keep them -for RT. 
- -This kernel_fpu_resched() makes the code more preemptible which might hurt -performance. - -Cc: stable-rt@vger.kernel.org -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/x86/include/asm/fpu/api.h | 1 + - arch/x86/kernel/fpu/core.c | 12 ++++++++++++ - 2 files changed, 13 insertions(+) - -diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h -index 67a4f1cb2aac..41d3be7da969 100644 ---- a/arch/x86/include/asm/fpu/api.h -+++ b/arch/x86/include/asm/fpu/api.h -@@ -28,6 +28,7 @@ extern void kernel_fpu_begin_mask(unsigned int kfpu_mask); - extern void kernel_fpu_end(void); - extern bool irq_fpu_usable(void); - extern void fpregs_mark_activate(void); -+extern void kernel_fpu_resched(void); - - /* Code that is unaware of kernel_fpu_begin_mask() can use this */ - static inline void kernel_fpu_begin(void) -diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c -index 571220ac8bea..d315d45b64fa 100644 ---- a/arch/x86/kernel/fpu/core.c -+++ b/arch/x86/kernel/fpu/core.c -@@ -159,6 +159,18 @@ void kernel_fpu_end(void) - } - EXPORT_SYMBOL_GPL(kernel_fpu_end); - -+void kernel_fpu_resched(void) -+{ -+ WARN_ON_FPU(!this_cpu_read(in_kernel_fpu)); -+ -+ if (should_resched(PREEMPT_OFFSET)) { -+ kernel_fpu_end(); -+ cond_resched(); -+ kernel_fpu_begin(); -+ } -+} -+EXPORT_SYMBOL_GPL(kernel_fpu_resched); -+ - /* - * Save the FPU state (mark it for reload if necessary): - * --- -2.30.2 - diff --git a/debian/patches-rt/0254-lockdep-selftest-Only-do-hardirq-context-test-for-ra.patch b/debian/patches-rt/0254-lockdep-selftest-Only-do-hardirq-context-test-for-ra.patch deleted file mode 100644 index 6b77d4e56..000000000 --- a/debian/patches-rt/0254-lockdep-selftest-Only-do-hardirq-context-test-for-ra.patch +++ /dev/null @@ -1,62 +0,0 @@ -From add0d393b892644c3be3775b1d117855b4b22fe7 Mon Sep 17 00:00:00 2001 -From: Yong Zhang <yong.zhang@windriver.com> -Date: Mon, 16 Apr 2012 15:01:56 +0800 -Subject: [PATCH 254/296] lockdep: 
selftest: Only do hardirq context test for - raw spinlock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -On -rt there is no softirq context any more and rwlock is sleepable, -disable softirq context test and rwlock+irq test. - -Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> -Cc: Yong Zhang <yong.zhang@windriver.com> -Link: http://lkml.kernel.org/r/1334559716-18447-3-git-send-email-yong.zhang0@gmail.com -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - lib/locking-selftest.c | 23 +++++++++++++++++++++++ - 1 file changed, 23 insertions(+) - -diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c -index a899b3f0e2e5..07539643403e 100644 ---- a/lib/locking-selftest.c -+++ b/lib/locking-selftest.c -@@ -2455,6 +2455,7 @@ void locking_selftest(void) - - printk(" --------------------------------------------------------------------------\n"); - -+#ifndef CONFIG_PREEMPT_RT - /* - * irq-context testcases: - */ -@@ -2469,6 +2470,28 @@ void locking_selftest(void) - DO_TESTCASE_6x2x2RW("irq read-recursion #2", irq_read_recursion2); - DO_TESTCASE_6x2x2RW("irq read-recursion #3", irq_read_recursion3); - -+#else -+ /* On -rt, we only do hardirq context test for raw spinlock */ -+ DO_TESTCASE_1B("hard-irqs-on + irq-safe-A", irqsafe1_hard_spin, 12); -+ DO_TESTCASE_1B("hard-irqs-on + irq-safe-A", irqsafe1_hard_spin, 21); -+ -+ DO_TESTCASE_1B("hard-safe-A + irqs-on", irqsafe2B_hard_spin, 12); -+ DO_TESTCASE_1B("hard-safe-A + irqs-on", irqsafe2B_hard_spin, 21); -+ -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #1", irqsafe3_hard_spin, 123); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #1", irqsafe3_hard_spin, 132); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #1", irqsafe3_hard_spin, 213); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #1", irqsafe3_hard_spin, 231); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #1", irqsafe3_hard_spin, 312); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #1", irqsafe3_hard_spin, 321); -+ -+ 
DO_TESTCASE_1B("hard-safe-A + unsafe-B #2", irqsafe4_hard_spin, 123); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #2", irqsafe4_hard_spin, 132); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #2", irqsafe4_hard_spin, 213); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #2", irqsafe4_hard_spin, 231); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #2", irqsafe4_hard_spin, 312); -+ DO_TESTCASE_1B("hard-safe-A + unsafe-B #2", irqsafe4_hard_spin, 321); -+#endif - ww_tests(); - - force_read_lock_recursive = 0; --- -2.30.2 - diff --git a/debian/patches-rt/0255-lockdep-selftest-fix-warnings-due-to-missing-PREEMPT.patch b/debian/patches-rt/0255-lockdep-selftest-fix-warnings-due-to-missing-PREEMPT.patch deleted file mode 100644 index d5700b9b6..000000000 --- a/debian/patches-rt/0255-lockdep-selftest-fix-warnings-due-to-missing-PREEMPT.patch +++ /dev/null @@ -1,150 +0,0 @@ -From c749d0a6937286a5bce4f610be2ab8e56c29d933 Mon Sep 17 00:00:00 2001 -From: Josh Cartwright <josh.cartwright@ni.com> -Date: Wed, 28 Jan 2015 13:08:45 -0600 -Subject: [PATCH 255/296] lockdep: selftest: fix warnings due to missing - PREEMPT_RT conditionals -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -"lockdep: Selftest: Only do hardirq context test for raw spinlock" -disabled the execution of certain tests with PREEMPT_RT, but did -not prevent the tests from still being defined. 
This leads to warnings -like: - - ./linux/lib/locking-selftest.c:574:1: warning: 'irqsafe1_hard_rlock_12' defined but not used [-Wunused-function] - ./linux/lib/locking-selftest.c:574:1: warning: 'irqsafe1_hard_rlock_21' defined but not used [-Wunused-function] - ./linux/lib/locking-selftest.c:577:1: warning: 'irqsafe1_hard_wlock_12' defined but not used [-Wunused-function] - ./linux/lib/locking-selftest.c:577:1: warning: 'irqsafe1_hard_wlock_21' defined but not used [-Wunused-function] - ./linux/lib/locking-selftest.c:580:1: warning: 'irqsafe1_soft_spin_12' defined but not used [-Wunused-function] - ... - -Fixed by wrapping the test definitions in #ifndef CONFIG_PREEMPT_RT -conditionals. - - -Signed-off-by: Josh Cartwright <josh.cartwright@ni.com> -Signed-off-by: Xander Huff <xander.huff@ni.com> -Acked-by: Gratian Crisan <gratian.crisan@ni.com> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - lib/locking-selftest.c | 28 ++++++++++++++++++++++++++++ - 1 file changed, 28 insertions(+) - -diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c -index 07539643403e..a2baedfff9ee 100644 ---- a/lib/locking-selftest.c -+++ b/lib/locking-selftest.c -@@ -786,6 +786,8 @@ GENERATE_TESTCASE(init_held_rtmutex); - #include "locking-selftest-spin-hardirq.h" - GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_hard_spin) - -+#ifndef CONFIG_PREEMPT_RT -+ - #include "locking-selftest-rlock-hardirq.h" - GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_hard_rlock) - -@@ -801,9 +803,12 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_soft_rlock) - #include "locking-selftest-wlock-softirq.h" - GENERATE_PERMUTATIONS_2_EVENTS(irqsafe1_soft_wlock) - -+#endif -+ - #undef E1 - #undef E2 - -+#ifndef CONFIG_PREEMPT_RT - /* - * Enabling hardirqs with a softirq-safe lock held: - */ -@@ -836,6 +841,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2A_rlock) - #undef E1 - #undef E2 - -+#endif -+ - /* - * Enabling irqs with an irq-safe lock held: - */ -@@ -859,6 +866,8 @@ 
GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2A_rlock) - #include "locking-selftest-spin-hardirq.h" - GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_hard_spin) - -+#ifndef CONFIG_PREEMPT_RT -+ - #include "locking-selftest-rlock-hardirq.h" - GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_hard_rlock) - -@@ -874,6 +883,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_rlock) - #include "locking-selftest-wlock-softirq.h" - GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_wlock) - -+#endif -+ - #undef E1 - #undef E2 - -@@ -905,6 +916,8 @@ GENERATE_PERMUTATIONS_2_EVENTS(irqsafe2B_soft_wlock) - #include "locking-selftest-spin-hardirq.h" - GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_hard_spin) - -+#ifndef CONFIG_PREEMPT_RT -+ - #include "locking-selftest-rlock-hardirq.h" - GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_hard_rlock) - -@@ -920,6 +933,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_rlock) - #include "locking-selftest-wlock-softirq.h" - GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_wlock) - -+#endif -+ - #undef E1 - #undef E2 - #undef E3 -@@ -953,6 +968,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe3_soft_wlock) - #include "locking-selftest-spin-hardirq.h" - GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_hard_spin) - -+#ifndef CONFIG_PREEMPT_RT -+ - #include "locking-selftest-rlock-hardirq.h" - GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_hard_rlock) - -@@ -968,10 +985,14 @@ GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_soft_rlock) - #include "locking-selftest-wlock-softirq.h" - GENERATE_PERMUTATIONS_3_EVENTS(irqsafe4_soft_wlock) - -+#endif -+ - #undef E1 - #undef E2 - #undef E3 - -+#ifndef CONFIG_PREEMPT_RT -+ - /* - * read-lock / write-lock irq inversion. - * -@@ -1161,6 +1182,11 @@ GENERATE_PERMUTATIONS_3_EVENTS(W1W2_R2R3_R3W1) - #undef E1 - #undef E2 - #undef E3 -+ -+#endif -+ -+#ifndef CONFIG_PREEMPT_RT -+ - /* - * read-lock / write-lock recursion that is actually safe. 
- */ -@@ -1207,6 +1233,8 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft_wlock) - #undef E2 - #undef E3 - -+#endif -+ - /* - * read-lock / write-lock recursion that is unsafe. - */ --- -2.30.2 - diff --git a/debian/patches-rt/0256-lockdep-disable-self-test.patch b/debian/patches-rt/0256-lockdep-disable-self-test.patch deleted file mode 100644 index e5a620c64..000000000 --- a/debian/patches-rt/0256-lockdep-disable-self-test.patch +++ /dev/null @@ -1,35 +0,0 @@ -From 8f9a320c168bdc516cddd62609280d8b19d49b8f Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 17 Oct 2017 16:36:18 +0200 -Subject: [PATCH 256/296] lockdep: disable self-test -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The self-test wasn't always 100% accurate for RT. We disabled a few -tests which failed because they had a different semantic for RT. Some -still reported false positives. Now the selftest locks up the system -during boot and it needs to be investigated… - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - lib/Kconfig.debug | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug -index dcf4a9028e16..1751f130e783 100644 ---- a/lib/Kconfig.debug -+++ b/lib/Kconfig.debug -@@ -1330,7 +1330,7 @@ config DEBUG_ATOMIC_SLEEP - - config DEBUG_LOCKING_API_SELFTESTS - bool "Locking API boot-time self-tests" -- depends on DEBUG_KERNEL -+ depends on DEBUG_KERNEL && !PREEMPT_RT - help - Say Y here if you want the kernel to run a short self-test during - bootup. 
The self-test checks whether common types of locking bugs --- -2.30.2 - diff --git a/debian/patches-rt/0257-drm-radeon-i915-Use-preempt_disable-enable_rt-where-.patch b/debian/patches-rt/0257-drm-radeon-i915-Use-preempt_disable-enable_rt-where-.patch deleted file mode 100644 index 79fed2482..000000000 --- a/debian/patches-rt/0257-drm-radeon-i915-Use-preempt_disable-enable_rt-where-.patch +++ /dev/null @@ -1,61 +0,0 @@ -From eb2defa6c252e19d67354f6374143b75c1a6fac3 Mon Sep 17 00:00:00 2001 -From: Mike Galbraith <umgwanakikbuti@gmail.com> -Date: Sat, 27 Feb 2016 08:09:11 +0100 -Subject: [PATCH 257/296] drm,radeon,i915: Use preempt_disable/enable_rt() - where recommended -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -DRM folks identified the spots, so use them. - -Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> -Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Cc: linux-rt-users <linux-rt-users@vger.kernel.org> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - drivers/gpu/drm/i915/i915_irq.c | 2 ++ - drivers/gpu/drm/radeon/radeon_display.c | 2 ++ - 2 files changed, 4 insertions(+) - -diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c -index 759f523c6a6b..7339a42ab2b8 100644 ---- a/drivers/gpu/drm/i915/i915_irq.c -+++ b/drivers/gpu/drm/i915/i915_irq.c -@@ -847,6 +847,7 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc, - spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); - - /* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */ -+ preempt_disable_rt(); - - /* Get optional system timestamp before query. */ - if (stime) -@@ -898,6 +899,7 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc, - *etime = ktime_get(); - - /* preempt_enable_rt() should go right here in PREEMPT_RT patchset. 
*/ -+ preempt_enable_rt(); - - spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); - -diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c -index e0ae911ef427..781edf550436 100644 ---- a/drivers/gpu/drm/radeon/radeon_display.c -+++ b/drivers/gpu/drm/radeon/radeon_display.c -@@ -1822,6 +1822,7 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int pipe, - struct radeon_device *rdev = dev->dev_private; - - /* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */ -+ preempt_disable_rt(); - - /* Get optional system timestamp before query. */ - if (stime) -@@ -1914,6 +1915,7 @@ int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int pipe, - *etime = ktime_get(); - - /* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */ -+ preempt_enable_rt(); - - /* Decode into vertical and horizontal scanout position. */ - *vpos = position & 0x1fff; --- -2.30.2 - diff --git a/debian/patches-rt/0261-drm-i915-gt-Only-disable-interrupts-for-the-timeline.patch b/debian/patches-rt/0261-drm-i915-gt-Only-disable-interrupts-for-the-timeline.patch deleted file mode 100644 index f3e33490b..000000000 --- a/debian/patches-rt/0261-drm-i915-gt-Only-disable-interrupts-for-the-timeline.patch +++ /dev/null @@ -1,52 +0,0 @@ -From 0749672a572de55d3bb02db8ad11f2f09cba9b9a Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 7 Jul 2020 12:25:11 +0200 -Subject: [PATCH 261/296] drm/i915/gt: Only disable interrupts for the timeline - lock on !force-threaded -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -According to commit - d67739268cf0e ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") - -the intrrupts are disabled the code may be called from an interrupt -handler and from preemptible context. 
-With `force_irqthreads' set the timeline mutex is never observed in IRQ -context so it is not needed to disable interrupts. - -Only disable interrupts if not in `force_irqthreads' mode. - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/gpu/drm/i915/gt/intel_engine_pm.c | 8 +++++--- - 1 file changed, 5 insertions(+), 3 deletions(-) - -diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c -index f7b2e07e2229..313d8a28e776 100644 ---- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c -+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c -@@ -60,9 +60,10 @@ static int __engine_unpark(struct intel_wakeref *wf) - - static inline unsigned long __timeline_mark_lock(struct intel_context *ce) - { -- unsigned long flags; -+ unsigned long flags = 0; - -- local_irq_save(flags); -+ if (!force_irqthreads) -+ local_irq_save(flags); - mutex_acquire(&ce->timeline->mutex.dep_map, 2, 0, _THIS_IP_); - - return flags; -@@ -72,7 +73,8 @@ static inline void __timeline_mark_unlock(struct intel_context *ce, - unsigned long flags) - { - mutex_release(&ce->timeline->mutex.dep_map, _THIS_IP_); -- local_irq_restore(flags); -+ if (!force_irqthreads) -+ local_irq_restore(flags); - } - - #else --- -2.30.2 - diff --git a/debian/patches-rt/0262-cpuset-Convert-callback_lock-to-raw_spinlock_t.patch b/debian/patches-rt/0262-cpuset-Convert-callback_lock-to-raw_spinlock_t.patch deleted file mode 100644 index f81aa0d9a..000000000 --- a/debian/patches-rt/0262-cpuset-Convert-callback_lock-to-raw_spinlock_t.patch +++ /dev/null @@ -1,329 +0,0 @@ -From 210e3c9bf70707c2bdfd10060622ca1d6bd49626 Mon Sep 17 00:00:00 2001 -From: Mike Galbraith <efault@gmx.de> -Date: Sun, 8 Jan 2017 09:32:25 +0100 -Subject: [PATCH 262/296] cpuset: Convert callback_lock to raw_spinlock_t -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The two commits below add up to a cpuset might_sleep() splat for RT: - 
-8447a0fee974 cpuset: convert callback_mutex to a spinlock -344736f29b35 cpuset: simplify cpuset_node_allowed API - -BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:995 -in_atomic(): 0, irqs_disabled(): 1, pid: 11718, name: cset -CPU: 135 PID: 11718 Comm: cset Tainted: G E 4.10.0-rt1-rt #4 -Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0056.R01.1409242327 09/24/2014 -Call Trace: - ? dump_stack+0x5c/0x81 - ? ___might_sleep+0xf4/0x170 - ? rt_spin_lock+0x1c/0x50 - ? __cpuset_node_allowed+0x66/0xc0 - ? ___slab_alloc+0x390/0x570 <disables IRQs> - ? anon_vma_fork+0x8f/0x140 - ? copy_page_range+0x6cf/0xb00 - ? anon_vma_fork+0x8f/0x140 - ? __slab_alloc.isra.74+0x5a/0x81 - ? anon_vma_fork+0x8f/0x140 - ? kmem_cache_alloc+0x1b5/0x1f0 - ? anon_vma_fork+0x8f/0x140 - ? copy_process.part.35+0x1670/0x1ee0 - ? _do_fork+0xdd/0x3f0 - ? _do_fork+0xdd/0x3f0 - ? do_syscall_64+0x61/0x170 - ? entry_SYSCALL64_slow_path+0x25/0x25 - -The latter ensured that a NUMA box WILL take callback_lock in atomic -context by removing the allocator and reclaim path __GFP_HARDWALL -usage which prevented such contexts from taking callback_mutex. - -One option would be to reinstate __GFP_HARDWALL protections for -RT, however, as the 8447a0fee974 changelog states: - -The callback_mutex is only used to synchronize reads/updates of cpusets' -flags and cpu/node masks. These operations should always proceed fast so -there's no reason why we can't use a spinlock instead of the mutex. 
- -Cc: stable-rt@vger.kernel.org -Signed-off-by: Mike Galbraith <efault@gmx.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/cgroup/cpuset.c | 70 +++++++++++++++++++++--------------------- - 1 file changed, 35 insertions(+), 35 deletions(-) - -diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c -index 53c70c470a38..8f4b2b9aa06c 100644 ---- a/kernel/cgroup/cpuset.c -+++ b/kernel/cgroup/cpuset.c -@@ -345,7 +345,7 @@ void cpuset_read_unlock(void) - percpu_up_read(&cpuset_rwsem); - } - --static DEFINE_SPINLOCK(callback_lock); -+static DEFINE_RAW_SPINLOCK(callback_lock); - - static struct workqueue_struct *cpuset_migrate_mm_wq; - -@@ -1280,7 +1280,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, - * Newly added CPUs will be removed from effective_cpus and - * newly deleted ones will be added back to effective_cpus. - */ -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (adding) { - cpumask_or(parent->subparts_cpus, - parent->subparts_cpus, tmp->addmask); -@@ -1299,7 +1299,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cpuset, int cmd, - } - - parent->nr_subparts_cpus = cpumask_weight(parent->subparts_cpus); -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - return cmd == partcmd_update; - } -@@ -1404,7 +1404,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) - continue; - rcu_read_unlock(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - - cpumask_copy(cp->effective_cpus, tmp->new_cpus); - if (cp->nr_subparts_cpus && -@@ -1435,7 +1435,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp) - = cpumask_weight(cp->subparts_cpus); - } - } -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - WARN_ON(!is_in_v2_mode() && - !cpumask_equal(cp->cpus_allowed, cp->effective_cpus)); -@@ -1553,7 +1553,7 @@ static int 
update_cpumask(struct cpuset *cs, struct cpuset *trialcs, - return -EINVAL; - } - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cpumask_copy(cs->cpus_allowed, trialcs->cpus_allowed); - - /* -@@ -1564,7 +1564,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, - cs->cpus_allowed); - cs->nr_subparts_cpus = cpumask_weight(cs->subparts_cpus); - } -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - update_cpumasks_hier(cs, &tmp); - -@@ -1758,9 +1758,9 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems) - continue; - rcu_read_unlock(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cp->effective_mems = *new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - WARN_ON(!is_in_v2_mode() && - !nodes_equal(cp->mems_allowed, cp->effective_mems)); -@@ -1828,9 +1828,9 @@ static int update_nodemask(struct cpuset *cs, struct cpuset *trialcs, - if (retval < 0) - goto done; - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->mems_allowed = trialcs->mems_allowed; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - /* use trialcs->mems_allowed as a temp variable */ - update_nodemasks_hier(cs, &trialcs->mems_allowed); -@@ -1921,9 +1921,9 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs, - spread_flag_changed = ((is_spread_slab(cs) != is_spread_slab(trialcs)) - || (is_spread_page(cs) != is_spread_page(trialcs))); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->flags = trialcs->flags; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - if (!cpumask_empty(trialcs->cpus_allowed) && balance_flag_changed) - rebuild_sched_domains_locked(); -@@ -2432,7 +2432,7 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v) - cpuset_filetype_t type = seq_cft(sf)->private; - int ret = 0; - -- 
spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - - switch (type) { - case FILE_CPULIST: -@@ -2454,7 +2454,7 @@ static int cpuset_common_seq_show(struct seq_file *sf, void *v) - ret = -EINVAL; - } - -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - return ret; - } - -@@ -2767,14 +2767,14 @@ static int cpuset_css_online(struct cgroup_subsys_state *css) - - cpuset_inc(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (is_in_v2_mode()) { - cpumask_copy(cs->effective_cpus, parent->effective_cpus); - cs->effective_mems = parent->effective_mems; - cs->use_parent_ecpus = true; - parent->child_ecpus_count++; - } -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags)) - goto out_unlock; -@@ -2801,12 +2801,12 @@ static int cpuset_css_online(struct cgroup_subsys_state *css) - } - rcu_read_unlock(); - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cs->mems_allowed = parent->mems_allowed; - cs->effective_mems = parent->mems_allowed; - cpumask_copy(cs->cpus_allowed, parent->cpus_allowed); - cpumask_copy(cs->effective_cpus, parent->cpus_allowed); -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - out_unlock: - percpu_up_write(&cpuset_rwsem); - put_online_cpus(); -@@ -2862,7 +2862,7 @@ static void cpuset_css_free(struct cgroup_subsys_state *css) - static void cpuset_bind(struct cgroup_subsys_state *root_css) - { - percpu_down_write(&cpuset_rwsem); -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - - if (is_in_v2_mode()) { - cpumask_copy(top_cpuset.cpus_allowed, cpu_possible_mask); -@@ -2873,7 +2873,7 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css) - top_cpuset.mems_allowed = top_cpuset.effective_mems; - } - -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - percpu_up_write(&cpuset_rwsem); - } - 
-@@ -2970,12 +2970,12 @@ hotplug_update_tasks_legacy(struct cpuset *cs, - { - bool is_empty; - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cpumask_copy(cs->cpus_allowed, new_cpus); - cpumask_copy(cs->effective_cpus, new_cpus); - cs->mems_allowed = *new_mems; - cs->effective_mems = *new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - /* - * Don't call update_tasks_cpumask() if the cpuset becomes empty, -@@ -3012,10 +3012,10 @@ hotplug_update_tasks(struct cpuset *cs, - if (nodes_empty(*new_mems)) - *new_mems = parent_cs(cs)->effective_mems; - -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - cpumask_copy(cs->effective_cpus, new_cpus); - cs->effective_mems = *new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - - if (cpus_updated) - update_tasks_cpumask(cs); -@@ -3170,7 +3170,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work) - - /* synchronize cpus_allowed to cpu_active_mask */ - if (cpus_updated) { -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (!on_dfl) - cpumask_copy(top_cpuset.cpus_allowed, &new_cpus); - /* -@@ -3190,17 +3190,17 @@ static void cpuset_hotplug_workfn(struct work_struct *work) - } - } - cpumask_copy(top_cpuset.effective_cpus, &new_cpus); -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - /* we don't mess with cpumasks of tasks in top_cpuset */ - } - - /* synchronize mems_allowed to N_MEMORY */ - if (mems_updated) { -- spin_lock_irq(&callback_lock); -+ raw_spin_lock_irq(&callback_lock); - if (!on_dfl) - top_cpuset.mems_allowed = new_mems; - top_cpuset.effective_mems = new_mems; -- spin_unlock_irq(&callback_lock); -+ raw_spin_unlock_irq(&callback_lock); - update_tasks_nodemask(&top_cpuset); - } - -@@ -3301,11 +3301,11 @@ void cpuset_cpus_allowed(struct task_struct *tsk, struct cpumask *pmask) - { - unsigned long flags; - -- spin_lock_irqsave(&callback_lock, 
flags); -+ raw_spin_lock_irqsave(&callback_lock, flags); - rcu_read_lock(); - guarantee_online_cpus(task_cs(tsk), pmask); - rcu_read_unlock(); -- spin_unlock_irqrestore(&callback_lock, flags); -+ raw_spin_unlock_irqrestore(&callback_lock, flags); - } - - /** -@@ -3366,11 +3366,11 @@ nodemask_t cpuset_mems_allowed(struct task_struct *tsk) - nodemask_t mask; - unsigned long flags; - -- spin_lock_irqsave(&callback_lock, flags); -+ raw_spin_lock_irqsave(&callback_lock, flags); - rcu_read_lock(); - guarantee_online_mems(task_cs(tsk), &mask); - rcu_read_unlock(); -- spin_unlock_irqrestore(&callback_lock, flags); -+ raw_spin_unlock_irqrestore(&callback_lock, flags); - - return mask; - } -@@ -3462,14 +3462,14 @@ bool __cpuset_node_allowed(int node, gfp_t gfp_mask) - return true; - - /* Not hardwall and node outside mems_allowed: scan up cpusets */ -- spin_lock_irqsave(&callback_lock, flags); -+ raw_spin_lock_irqsave(&callback_lock, flags); - - rcu_read_lock(); - cs = nearest_hardwall_ancestor(task_cs(current)); - allowed = node_isset(node, cs->mems_allowed); - rcu_read_unlock(); - -- spin_unlock_irqrestore(&callback_lock, flags); -+ raw_spin_unlock_irqrestore(&callback_lock, flags); - return allowed; - } - --- -2.30.2 - diff --git a/debian/patches-rt/0264-mm-scatterlist-Do-not-disable-irqs-on-RT.patch b/debian/patches-rt/0264-mm-scatterlist-Do-not-disable-irqs-on-RT.patch deleted file mode 100644 index 8b9048abd..000000000 --- a/debian/patches-rt/0264-mm-scatterlist-Do-not-disable-irqs-on-RT.patch +++ /dev/null @@ -1,30 +0,0 @@ -From 96eca916f7d434eba1d762b03ca76f4aa9bc2c36 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 3 Jul 2009 08:44:34 -0500 -Subject: [PATCH 264/296] mm/scatterlist: Do not disable irqs on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -For -RT it is enough to keep pagefault disabled (which is currently handled by -kmap_atomic()). 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - lib/scatterlist.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/lib/scatterlist.c b/lib/scatterlist.c -index a59778946404..907f59045998 100644 ---- a/lib/scatterlist.c -+++ b/lib/scatterlist.c -@@ -892,7 +892,7 @@ void sg_miter_stop(struct sg_mapping_iter *miter) - flush_kernel_dcache_page(miter->page); - - if (miter->__flags & SG_MITER_ATOMIC) { -- WARN_ON_ONCE(preemptible()); -+ WARN_ON_ONCE(!pagefault_disabled()); - kunmap_atomic(miter->addr); - } else - kunmap(miter->page); --- -2.30.2 - diff --git a/debian/patches-rt/0268-arm-Add-support-for-lazy-preemption.patch b/debian/patches-rt/0268-arm-Add-support-for-lazy-preemption.patch deleted file mode 100644 index deedbc002..000000000 --- a/debian/patches-rt/0268-arm-Add-support-for-lazy-preemption.patch +++ /dev/null @@ -1,168 +0,0 @@ -From 14e18fb408cc534ce4a0c99c177e02d4a4c8907f Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Wed, 31 Oct 2012 12:04:11 +0100 -Subject: [PATCH 268/296] arm: Add support for lazy preemption -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Implement the arm pieces for lazy preempt. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - arch/arm/Kconfig | 1 + - arch/arm/include/asm/thread_info.h | 8 ++++++-- - arch/arm/kernel/asm-offsets.c | 1 + - arch/arm/kernel/entry-armv.S | 19 ++++++++++++++++--- - arch/arm/kernel/entry-common.S | 9 +++++++-- - arch/arm/kernel/signal.c | 3 ++- - 6 files changed, 33 insertions(+), 8 deletions(-) - -diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig -index 4708ede3b826..229a806a3dd7 100644 ---- a/arch/arm/Kconfig -+++ b/arch/arm/Kconfig -@@ -105,6 +105,7 @@ config ARM - select HAVE_PERF_EVENTS - select HAVE_PERF_REGS - select HAVE_PERF_USER_STACK_DUMP -+ select HAVE_PREEMPT_LAZY - select MMU_GATHER_RCU_TABLE_FREE if SMP && ARM_LPAE - select HAVE_REGS_AND_STACK_ACCESS_API - select HAVE_RSEQ -diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h -index 536b6b979f63..875aaf9af946 100644 ---- a/arch/arm/include/asm/thread_info.h -+++ b/arch/arm/include/asm/thread_info.h -@@ -46,6 +46,7 @@ struct cpu_context_save { - struct thread_info { - unsigned long flags; /* low level flags */ - int preempt_count; /* 0 => preemptable, <0 => bug */ -+ int preempt_lazy_count; /* 0 => preemptable, <0 => bug */ - mm_segment_t addr_limit; /* address limit */ - struct task_struct *task; /* main task structure */ - __u32 cpu; /* cpu */ -@@ -134,7 +135,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, - #define TIF_SYSCALL_TRACE 4 /* syscall trace active */ - #define TIF_SYSCALL_AUDIT 5 /* syscall auditing active */ - #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ --#define TIF_SECCOMP 7 /* seccomp syscall filtering active */ -+#define TIF_NEED_RESCHED_LAZY 7 -+#define TIF_SECCOMP 8 /* seccomp syscall filtering active */ - - #define TIF_USING_IWMMXT 17 - #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ -@@ -143,6 +145,7 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, - #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) - #define 
_TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) - #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) -+#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY) - #define _TIF_UPROBE (1 << TIF_UPROBE) - #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE) - #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT) -@@ -158,7 +161,8 @@ extern int vfp_restore_user_hwstate(struct user_vfp *, - * Change these and you break ASM code in entry-common.S - */ - #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ -- _TIF_NOTIFY_RESUME | _TIF_UPROBE) -+ _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ -+ _TIF_NEED_RESCHED_LAZY) - - #endif /* __KERNEL__ */ - #endif /* __ASM_ARM_THREAD_INFO_H */ -diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c -index be8050b0c3df..884e40a525ce 100644 ---- a/arch/arm/kernel/asm-offsets.c -+++ b/arch/arm/kernel/asm-offsets.c -@@ -42,6 +42,7 @@ int main(void) - BLANK(); - DEFINE(TI_FLAGS, offsetof(struct thread_info, flags)); - DEFINE(TI_PREEMPT, offsetof(struct thread_info, preempt_count)); -+ DEFINE(TI_PREEMPT_LAZY, offsetof(struct thread_info, preempt_lazy_count)); - DEFINE(TI_ADDR_LIMIT, offsetof(struct thread_info, addr_limit)); - DEFINE(TI_TASK, offsetof(struct thread_info, task)); - DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); -diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S -index 1c9e6d1452c5..6160eeeab6a8 100644 ---- a/arch/arm/kernel/entry-armv.S -+++ b/arch/arm/kernel/entry-armv.S -@@ -206,11 +206,18 @@ __irq_svc: - - #ifdef CONFIG_PREEMPTION - ldr r8, [tsk, #TI_PREEMPT] @ get preempt count -- ldr r0, [tsk, #TI_FLAGS] @ get flags - teq r8, #0 @ if preempt count != 0 -+ bne 1f @ return from exeption -+ ldr r0, [tsk, #TI_FLAGS] @ get flags -+ tst r0, #_TIF_NEED_RESCHED @ if NEED_RESCHED is set -+ blne svc_preempt @ preempt! 
-+ -+ ldr r8, [tsk, #TI_PREEMPT_LAZY] @ get preempt lazy count -+ teq r8, #0 @ if preempt lazy count != 0 - movne r0, #0 @ force flags to 0 -- tst r0, #_TIF_NEED_RESCHED -+ tst r0, #_TIF_NEED_RESCHED_LAZY - blne svc_preempt -+1: - #endif - - svc_exit r5, irq = 1 @ return from exception -@@ -225,8 +232,14 @@ svc_preempt: - 1: bl preempt_schedule_irq @ irq en/disable is done inside - ldr r0, [tsk, #TI_FLAGS] @ get new tasks TI_FLAGS - tst r0, #_TIF_NEED_RESCHED -+ bne 1b -+ tst r0, #_TIF_NEED_RESCHED_LAZY - reteq r8 @ go again -- b 1b -+ ldr r0, [tsk, #TI_PREEMPT_LAZY] @ get preempt lazy count -+ teq r0, #0 @ if preempt lazy count != 0 -+ beq 1b -+ ret r8 @ go again -+ - #endif - - __und_fault: -diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S -index 271cb8a1eba1..fd039b1b3731 100644 ---- a/arch/arm/kernel/entry-common.S -+++ b/arch/arm/kernel/entry-common.S -@@ -53,7 +53,9 @@ __ret_fast_syscall: - cmp r2, #TASK_SIZE - blne addr_limit_check_failed - ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing -- tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK -+ tst r1, #((_TIF_SYSCALL_WORK | _TIF_WORK_MASK) & ~_TIF_SECCOMP) -+ bne fast_work_pending -+ tst r1, #_TIF_SECCOMP - bne fast_work_pending - - -@@ -90,8 +92,11 @@ __ret_fast_syscall: - cmp r2, #TASK_SIZE - blne addr_limit_check_failed - ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing -- tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK -+ tst r1, #((_TIF_SYSCALL_WORK | _TIF_WORK_MASK) & ~_TIF_SECCOMP) -+ bne do_slower_path -+ tst r1, #_TIF_SECCOMP - beq no_work_pending -+do_slower_path: - UNWIND(.fnend ) - ENDPROC(ret_fast_syscall) - -diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c -index 2f81d3af5f9a..6e69f7b3d581 100644 ---- a/arch/arm/kernel/signal.c -+++ b/arch/arm/kernel/signal.c -@@ -649,7 +649,8 @@ do_work_pending(struct pt_regs *regs, unsigned int thread_flags, int syscall) - */ - trace_hardirqs_off(); - do { -- if (likely(thread_flags & _TIF_NEED_RESCHED)) { -+ if 
(likely(thread_flags & (_TIF_NEED_RESCHED | -+ _TIF_NEED_RESCHED_LAZY))) { - schedule(); - } else { - if (unlikely(!user_mode(regs))) --- -2.30.2 - diff --git a/debian/patches-rt/0269-powerpc-Add-support-for-lazy-preemption.patch b/debian/patches-rt/0269-powerpc-Add-support-for-lazy-preemption.patch deleted file mode 100644 index 8d273c9a9..000000000 --- a/debian/patches-rt/0269-powerpc-Add-support-for-lazy-preemption.patch +++ /dev/null @@ -1,267 +0,0 @@ -From f79ecf3348411d08e4dd3237af9293d7012fcb14 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 1 Nov 2012 10:14:11 +0100 -Subject: [PATCH 269/296] powerpc: Add support for lazy preemption -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Implement the powerpc pieces for lazy preempt. - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - arch/powerpc/Kconfig | 1 + - arch/powerpc/include/asm/thread_info.h | 17 +++++++++++++---- - arch/powerpc/kernel/asm-offsets.c | 1 + - arch/powerpc/kernel/entry_32.S | 23 ++++++++++++++++------- - arch/powerpc/kernel/exceptions-64e.S | 16 ++++++++++++---- - arch/powerpc/kernel/syscall_64.c | 10 +++++++--- - 6 files changed, 50 insertions(+), 18 deletions(-) - -diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig -index fc7b4258ea6d..155add269fc4 100644 ---- a/arch/powerpc/Kconfig -+++ b/arch/powerpc/Kconfig -@@ -230,6 +230,7 @@ config PPC - select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH - select HAVE_PERF_REGS - select HAVE_PERF_USER_STACK_DUMP -+ select HAVE_PREEMPT_LAZY - select MMU_GATHER_RCU_TABLE_FREE - select MMU_GATHER_PAGE_SIZE - select HAVE_REGS_AND_STACK_ACCESS_API -diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h -index 46a210b03d2b..0e316b44b2d7 100644 ---- a/arch/powerpc/include/asm/thread_info.h -+++ b/arch/powerpc/include/asm/thread_info.h -@@ -48,6 +48,8 @@ - 
struct thread_info { - int preempt_count; /* 0 => preemptable, - <0 => BUG */ -+ int preempt_lazy_count; /* 0 => preemptable, -+ <0 => BUG */ - unsigned long local_flags; /* private flags for thread */ - #ifdef CONFIG_LIVEPATCH - unsigned long *livepatch_sp; -@@ -97,11 +99,12 @@ void arch_setup_new_exec(void); - #define TIF_SINGLESTEP 8 /* singlestepping active */ - #define TIF_NOHZ 9 /* in adaptive nohz mode */ - #define TIF_SECCOMP 10 /* secure computing */ --#define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */ --#define TIF_NOERROR 12 /* Force successful syscall return */ -+ -+#define TIF_NEED_RESCHED_LAZY 11 /* lazy rescheduling necessary */ -+#define TIF_SYSCALL_TRACEPOINT 12 /* syscall tracepoint instrumentation */ -+ - #define TIF_NOTIFY_RESUME 13 /* callback before returning to user */ - #define TIF_UPROBE 14 /* breakpointed or single-stepping */ --#define TIF_SYSCALL_TRACEPOINT 15 /* syscall tracepoint instrumentation */ - #define TIF_EMULATE_STACK_STORE 16 /* Is an instruction emulation - for stack store? 
*/ - #define TIF_MEMDIE 17 /* is terminating due to OOM killer */ -@@ -110,6 +113,9 @@ void arch_setup_new_exec(void); - #endif - #define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is polling TIF_NEED_RESCHED */ - #define TIF_32BIT 20 /* 32 bit binary */ -+#define TIF_RESTOREALL 21 /* Restore all regs (implies NOERROR) */ -+#define TIF_NOERROR 22 /* Force successful syscall return */ -+ - - /* as above, but as bit values */ - #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) -@@ -129,6 +135,7 @@ void arch_setup_new_exec(void); - #define _TIF_SYSCALL_TRACEPOINT (1<<TIF_SYSCALL_TRACEPOINT) - #define _TIF_EMULATE_STACK_STORE (1<<TIF_EMULATE_STACK_STORE) - #define _TIF_NOHZ (1<<TIF_NOHZ) -+#define _TIF_NEED_RESCHED_LAZY (1<<TIF_NEED_RESCHED_LAZY) - #define _TIF_SYSCALL_EMU (1<<TIF_SYSCALL_EMU) - #define _TIF_SYSCALL_DOTRACE (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ - _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \ -@@ -136,8 +143,10 @@ void arch_setup_new_exec(void); - - #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ - _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ -- _TIF_RESTORE_TM | _TIF_PATCH_PENDING) -+ _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \ -+ _TIF_NEED_RESCHED_LAZY) - #define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR) -+#define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY) - - /* Bits in local_flags */ - /* Don't move TLF_NAPPING without adjusting the code in entry_32.S */ -diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c -index 5c125255571c..597379121407 100644 ---- a/arch/powerpc/kernel/asm-offsets.c -+++ b/arch/powerpc/kernel/asm-offsets.c -@@ -189,6 +189,7 @@ int main(void) - OFFSET(TI_FLAGS, thread_info, flags); - OFFSET(TI_LOCAL_FLAGS, thread_info, local_flags); - OFFSET(TI_PREEMPT, thread_info, preempt_count); -+ OFFSET(TI_PREEMPT_LAZY, thread_info, preempt_lazy_count); - - #ifdef CONFIG_PPC64 - OFFSET(DCACHEL1BLOCKSIZE, ppc64_caches, l1d.block_size); -diff --git 
a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S -index 459f5d00b990..fc9517a97640 100644 ---- a/arch/powerpc/kernel/entry_32.S -+++ b/arch/powerpc/kernel/entry_32.S -@@ -414,7 +414,9 @@ ret_from_syscall: - mtmsr r10 - lwz r9,TI_FLAGS(r2) - li r8,-MAX_ERRNO -- andi. r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK) -+ lis r0,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)@h -+ ori r0,r0, (_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)@l -+ and. r0,r9,r0 - bne- syscall_exit_work - cmplw 0,r3,r8 - blt+ syscall_exit_cont -@@ -530,13 +532,13 @@ syscall_dotrace: - b syscall_dotrace_cont - - syscall_exit_work: -- andi. r0,r9,_TIF_RESTOREALL -+ andis. r0,r9,_TIF_RESTOREALL@h - beq+ 0f - REST_NVGPRS(r1) - b 2f - 0: cmplw 0,r3,r8 - blt+ 1f -- andi. r0,r9,_TIF_NOERROR -+ andis. r0,r9,_TIF_NOERROR@h - bne- 1f - lwz r11,_CCR(r1) /* Load CR */ - neg r3,r3 -@@ -545,12 +547,12 @@ syscall_exit_work: - - 1: stw r6,RESULT(r1) /* Save result */ - stw r3,GPR3(r1) /* Update return value */ --2: andi. r0,r9,(_TIF_PERSYSCALL_MASK) -+2: andis. r0,r9,(_TIF_PERSYSCALL_MASK)@h - beq 4f - - /* Clear per-syscall TIF flags if any are set. */ - -- li r11,_TIF_PERSYSCALL_MASK -+ lis r11,(_TIF_PERSYSCALL_MASK)@h - addi r12,r2,TI_FLAGS - 3: lwarx r8,0,r12 - andc r8,r8,r11 -@@ -927,7 +929,14 @@ resume_kernel: - cmpwi 0,r0,0 /* if non-zero, just restore regs and return */ - bne restore_kuap - andi. r8,r8,_TIF_NEED_RESCHED -+ bne+ 1f -+ lwz r0,TI_PREEMPT_LAZY(r2) -+ cmpwi 0,r0,0 /* if non-zero, just restore regs and return */ -+ bne restore_kuap -+ lwz r0,TI_FLAGS(r2) -+ andi. r0,r0,_TIF_NEED_RESCHED_LAZY - beq+ restore_kuap -+1: - lwz r3,_MSR(r1) - andi. r0,r3,MSR_EE /* interrupts off? */ - beq restore_kuap /* don't schedule if so */ -@@ -1248,7 +1257,7 @@ global_dbcr0: - #endif /* !(CONFIG_4xx || CONFIG_BOOKE) */ - - do_work: /* r10 contains MSR_KERNEL here */ -- andi. 
r0,r9,_TIF_NEED_RESCHED -+ andi. r0,r9,_TIF_NEED_RESCHED_MASK - beq do_user_signal - - do_resched: /* r10 contains MSR_KERNEL here */ -@@ -1267,7 +1276,7 @@ recheck: - LOAD_REG_IMMEDIATE(r10,MSR_KERNEL) - mtmsr r10 /* disable interrupts */ - lwz r9,TI_FLAGS(r2) -- andi. r0,r9,_TIF_NEED_RESCHED -+ andi. r0,r9,_TIF_NEED_RESCHED_MASK - bne- do_resched - andi. r0,r9,_TIF_USER_WORK_MASK - beq restore_user -diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S -index f579ce46eef2..715ff292a8f8 100644 ---- a/arch/powerpc/kernel/exceptions-64e.S -+++ b/arch/powerpc/kernel/exceptions-64e.S -@@ -1080,7 +1080,7 @@ _GLOBAL(ret_from_except_lite) - li r10, -1 - mtspr SPRN_DBSR,r10 - b restore --1: andi. r0,r4,_TIF_NEED_RESCHED -+1: andi. r0,r4,_TIF_NEED_RESCHED_MASK - beq 2f - bl restore_interrupts - SCHEDULE_USER -@@ -1132,12 +1132,20 @@ resume_kernel: - bne- 0b - 1: - --#ifdef CONFIG_PREEMPT -+#ifdef CONFIG_PREEMPTION - /* Check if we need to preempt */ -+ lwz r8,TI_PREEMPT(r9) -+ cmpwi 0,r8,0 /* if non-zero, just restore regs and return */ -+ bne restore - andi. r0,r4,_TIF_NEED_RESCHED -+ bne+ check_count -+ -+ andi. r0,r4,_TIF_NEED_RESCHED_LAZY - beq+ restore -+ lwz r8,TI_PREEMPT_LAZY(r9) -+ - /* Check that preempt_count() == 0 and interrupts are enabled */ -- lwz r8,TI_PREEMPT(r9) -+check_count: - cmpwi cr0,r8,0 - bne restore - ld r0,SOFTE(r1) -@@ -1158,7 +1166,7 @@ resume_kernel: - * interrupted after loading SRR0/1. 
- */ - wrteei 0 --#endif /* CONFIG_PREEMPT */ -+#endif /* CONFIG_PREEMPTION */ - - restore: - /* -diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c -index 310bcd768cd5..ae3212dcf562 100644 ---- a/arch/powerpc/kernel/syscall_64.c -+++ b/arch/powerpc/kernel/syscall_64.c -@@ -193,7 +193,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3, - ti_flags = READ_ONCE(*ti_flagsp); - while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) { - local_irq_enable(); -- if (ti_flags & _TIF_NEED_RESCHED) { -+ if (ti_flags & _TIF_NEED_RESCHED_MASK) { - schedule(); - } else { - /* -@@ -277,7 +277,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned - ti_flags = READ_ONCE(*ti_flagsp); - while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) { - local_irq_enable(); /* returning to user: may enable */ -- if (ti_flags & _TIF_NEED_RESCHED) { -+ if (ti_flags & _TIF_NEED_RESCHED_MASK) { - schedule(); - } else { - if (ti_flags & _TIF_SIGPENDING) -@@ -361,11 +361,15 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign - /* Returning to a kernel context with local irqs enabled. 
*/ - WARN_ON_ONCE(!(regs->msr & MSR_EE)); - again: -- if (IS_ENABLED(CONFIG_PREEMPT)) { -+ if (IS_ENABLED(CONFIG_PREEMPTION)) { - /* Return to preemptible kernel context */ - if (unlikely(*ti_flagsp & _TIF_NEED_RESCHED)) { - if (preempt_count() == 0) - preempt_schedule_irq(); -+ } else if (unlikely(*ti_flagsp & _TIF_NEED_RESCHED_LAZY)) { -+ if ((preempt_count() == 0) && -+ (current_thread_info()->preempt_lazy_count == 0)) -+ preempt_schedule_irq(); - } - } - --- -2.30.2 - diff --git a/debian/patches-rt/0272-leds-trigger-disable-CPU-trigger-on-RT.patch b/debian/patches-rt/0272-leds-trigger-disable-CPU-trigger-on-RT.patch deleted file mode 100644 index 0fecd59f4..000000000 --- a/debian/patches-rt/0272-leds-trigger-disable-CPU-trigger-on-RT.patch +++ /dev/null @@ -1,41 +0,0 @@ -From bd326bc45d2c51cdc85ecbf80c87e28fd2288f0b Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 23 Jan 2014 14:45:59 +0100 -Subject: [PATCH 272/296] leds: trigger: disable CPU trigger on -RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -as it triggers: -|CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.8-rt10 #141 -|[<c0014aa4>] (unwind_backtrace+0x0/0xf8) from [<c0012788>] (show_stack+0x1c/0x20) -|[<c0012788>] (show_stack+0x1c/0x20) from [<c043c8dc>] (dump_stack+0x20/0x2c) -|[<c043c8dc>] (dump_stack+0x20/0x2c) from [<c004c5e8>] (__might_sleep+0x13c/0x170) -|[<c004c5e8>] (__might_sleep+0x13c/0x170) from [<c043f270>] (__rt_spin_lock+0x28/0x38) -|[<c043f270>] (__rt_spin_lock+0x28/0x38) from [<c043fa00>] (rt_read_lock+0x68/0x7c) -|[<c043fa00>] (rt_read_lock+0x68/0x7c) from [<c036cf74>] (led_trigger_event+0x2c/0x5c) -|[<c036cf74>] (led_trigger_event+0x2c/0x5c) from [<c036e0bc>] (ledtrig_cpu+0x54/0x5c) -|[<c036e0bc>] (ledtrig_cpu+0x54/0x5c) from [<c000ffd8>] (arch_cpu_idle_exit+0x18/0x1c) -|[<c000ffd8>] (arch_cpu_idle_exit+0x18/0x1c) from [<c00590b8>] (cpu_startup_entry+0xa8/0x234) -|[<c00590b8>] 
(cpu_startup_entry+0xa8/0x234) from [<c043b2cc>] (rest_init+0xb8/0xe0) -|[<c043b2cc>] (rest_init+0xb8/0xe0) from [<c061ebe0>] (start_kernel+0x2c4/0x380) - - -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - drivers/leds/trigger/Kconfig | 1 + - 1 file changed, 1 insertion(+) - -diff --git a/drivers/leds/trigger/Kconfig b/drivers/leds/trigger/Kconfig -index ce9429ca6dde..29ccbd6acf43 100644 ---- a/drivers/leds/trigger/Kconfig -+++ b/drivers/leds/trigger/Kconfig -@@ -64,6 +64,7 @@ config LEDS_TRIGGER_BACKLIGHT - - config LEDS_TRIGGER_CPU - bool "LED CPU Trigger" -+ depends on !PREEMPT_RT - help - This allows LEDs to be controlled by active CPUs. This shows - the active CPUs across an array of LEDs so you can see which --- -2.30.2 - diff --git a/debian/patches-rt/0278-arm64-fpsimd-Delay-freeing-memory-in-fpsimd_flush_th.patch b/debian/patches-rt/0278-arm64-fpsimd-Delay-freeing-memory-in-fpsimd_flush_th.patch deleted file mode 100644 index d55055d4c..000000000 --- a/debian/patches-rt/0278-arm64-fpsimd-Delay-freeing-memory-in-fpsimd_flush_th.patch +++ /dev/null @@ -1,66 +0,0 @@ -From 9dd8053b06e5dd82835dc464918bf9b6669987c0 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 25 Jul 2018 14:02:38 +0200 -Subject: [PATCH 278/296] arm64: fpsimd: Delay freeing memory in - fpsimd_flush_thread() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -fpsimd_flush_thread() invokes kfree() via sve_free() within a preempt disabled -section which is not working on -RT. - -Delay freeing of memory until preemption is enabled again. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/arm64/kernel/fpsimd.c | 14 +++++++++++++- - 1 file changed, 13 insertions(+), 1 deletion(-) - -diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c -index 062b21f30f94..0ea2df6554e5 100644 ---- a/arch/arm64/kernel/fpsimd.c -+++ b/arch/arm64/kernel/fpsimd.c -@@ -226,6 +226,16 @@ static void sve_free(struct task_struct *task) - __sve_free(task); - } - -+static void *sve_free_atomic(struct task_struct *task) -+{ -+ void *sve_state = task->thread.sve_state; -+ -+ WARN_ON(test_tsk_thread_flag(task, TIF_SVE)); -+ -+ task->thread.sve_state = NULL; -+ return sve_state; -+} -+ - /* - * TIF_SVE controls whether a task can use SVE without trapping while - * in userspace, and also the way a task's FPSIMD/SVE state is stored -@@ -1022,6 +1032,7 @@ void fpsimd_thread_switch(struct task_struct *next) - void fpsimd_flush_thread(void) - { - int vl, supported_vl; -+ void *mem = NULL; - - if (!system_supports_fpsimd()) - return; -@@ -1034,7 +1045,7 @@ void fpsimd_flush_thread(void) - - if (system_supports_sve()) { - clear_thread_flag(TIF_SVE); -- sve_free(current); -+ mem = sve_free_atomic(current); - - /* - * Reset the task vector length as required. 
-@@ -1068,6 +1079,7 @@ void fpsimd_flush_thread(void) - } - - put_cpu_fpsimd_context(); -+ kfree(mem); - } - - /* --- -2.30.2 - diff --git a/debian/patches-rt/0286-powerpc-Avoid-recursive-header-includes.patch b/debian/patches-rt/0286-powerpc-Avoid-recursive-header-includes.patch deleted file mode 100644 index fb8e4e528..000000000 --- a/debian/patches-rt/0286-powerpc-Avoid-recursive-header-includes.patch +++ /dev/null @@ -1,48 +0,0 @@ -From 7fe218c9dc40d20cedeec766061c6fe7b363bd7a Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 8 Jan 2021 19:48:21 +0100 -Subject: [PATCH 286/296] powerpc: Avoid recursive header includes -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -- The include of bug.h leads to an include of printk.h which gets back - to spinlock.h and complains then about missing xchg(). - Remove bug.h and add bits.h which is needed for BITS_PER_BYTE. - -- Avoid the "please don't include this file directly" error from - rwlock-rt. Allow an include from/with rtmutex.h. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/powerpc/include/asm/cmpxchg.h | 2 +- - arch/powerpc/include/asm/simple_spinlock_types.h | 2 +- - 2 files changed, 2 insertions(+), 2 deletions(-) - -diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h -index cf091c4c22e5..7371f7e23c35 100644 ---- a/arch/powerpc/include/asm/cmpxchg.h -+++ b/arch/powerpc/include/asm/cmpxchg.h -@@ -5,7 +5,7 @@ - #ifdef __KERNEL__ - #include <linux/compiler.h> - #include <asm/synch.h> --#include <linux/bug.h> -+#include <linux/bits.h> - - #ifdef __BIG_ENDIAN - #define BITOFF_CAL(size, off) ((sizeof(u32) - size - off) * BITS_PER_BYTE) -diff --git a/arch/powerpc/include/asm/simple_spinlock_types.h b/arch/powerpc/include/asm/simple_spinlock_types.h -index 0f3cdd8faa95..d45561e9e6ba 100644 ---- a/arch/powerpc/include/asm/simple_spinlock_types.h -+++ b/arch/powerpc/include/asm/simple_spinlock_types.h -@@ -2,7 +2,7 @@ - #ifndef _ASM_POWERPC_SIMPLE_SPINLOCK_TYPES_H - #define _ASM_POWERPC_SIMPLE_SPINLOCK_TYPES_H - --#ifndef __LINUX_SPINLOCK_TYPES_H -+#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__LINUX_RT_MUTEX_H) - # error "please don't include this file directly" - #endif - --- -2.30.2 - diff --git a/debian/patches-rt/0287-POWERPC-Allow-to-enable-RT.patch b/debian/patches-rt/0287-POWERPC-Allow-to-enable-RT.patch deleted file mode 100644 index e209d8a98..000000000 --- a/debian/patches-rt/0287-POWERPC-Allow-to-enable-RT.patch +++ /dev/null @@ -1,36 +0,0 @@ -From 035a8354a71d8d8ee9a5f4c454ce89ba2386e926 Mon Sep 17 00:00:00 2001 -From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 11 Oct 2019 13:14:41 +0200 -Subject: [PATCH 287/296] POWERPC: Allow to enable RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Allow to select RT. 
- -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - arch/powerpc/Kconfig | 2 ++ - 1 file changed, 2 insertions(+) - -diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig -index 155add269fc4..71529672b738 100644 ---- a/arch/powerpc/Kconfig -+++ b/arch/powerpc/Kconfig -@@ -146,6 +146,7 @@ config PPC - select ARCH_MIGHT_HAVE_PC_SERIO - select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX - select ARCH_SUPPORTS_ATOMIC_RMW -+ select ARCH_SUPPORTS_RT if HAVE_POSIX_CPU_TIMERS_TASK_WORK - select ARCH_USE_BUILTIN_BSWAP - select ARCH_USE_CMPXCHG_LOCKREF if PPC64 - select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS -@@ -238,6 +239,7 @@ config PPC - select HAVE_SYSCALL_TRACEPOINTS - select HAVE_VIRT_CPU_ACCOUNTING - select HAVE_IRQ_TIME_ACCOUNTING -+ select HAVE_POSIX_CPU_TIMERS_TASK_WORK if !KVM - select HAVE_RSEQ - select IOMMU_HELPER if PPC64 - select IRQ_DOMAIN --- -2.30.2 - diff --git a/debian/patches-rt/0290-signals-Allow-rt-tasks-to-cache-one-sigqueue-struct.patch b/debian/patches-rt/0290-signals-Allow-rt-tasks-to-cache-one-sigqueue-struct.patch deleted file mode 100644 index e69c0ea63..000000000 --- a/debian/patches-rt/0290-signals-Allow-rt-tasks-to-cache-one-sigqueue-struct.patch +++ /dev/null @@ -1,212 +0,0 @@ -From b5c15e8a9b4b9a6923dac9e886f827a7630a8852 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 3 Jul 2009 08:44:56 -0500 -Subject: [PATCH 290/296] signals: Allow rt tasks to cache one sigqueue struct -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -To avoid allocation allow rt tasks to cache one sigqueue struct in -task struct. 
- -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/sched.h | 1 + - include/linux/signal.h | 1 + - kernel/exit.c | 2 +- - kernel/fork.c | 1 + - kernel/signal.c | 69 +++++++++++++++++++++++++++++++++++++++--- - 5 files changed, 69 insertions(+), 5 deletions(-) - -diff --git a/include/linux/sched.h b/include/linux/sched.h -index 28509a37eb71..e688e9307a21 100644 ---- a/include/linux/sched.h -+++ b/include/linux/sched.h -@@ -985,6 +985,7 @@ struct task_struct { - /* Signal handlers: */ - struct signal_struct *signal; - struct sighand_struct __rcu *sighand; -+ struct sigqueue *sigqueue_cache; - sigset_t blocked; - sigset_t real_blocked; - /* Restored if set_restore_sigmask() was used: */ -diff --git a/include/linux/signal.h b/include/linux/signal.h -index b256f9c65661..ebf6c515a7b2 100644 ---- a/include/linux/signal.h -+++ b/include/linux/signal.h -@@ -265,6 +265,7 @@ static inline void init_sigpending(struct sigpending *sig) - } - - extern void flush_sigqueue(struct sigpending *queue); -+extern void flush_task_sigqueue(struct task_struct *tsk); - - /* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */ - static inline int valid_signal(unsigned long sig) -diff --git a/kernel/exit.c b/kernel/exit.c -index d13d67fc5f4e..f5933bd07932 100644 ---- a/kernel/exit.c -+++ b/kernel/exit.c -@@ -152,7 +152,7 @@ static void __exit_signal(struct task_struct *tsk) - * Do this under ->siglock, we can race with another thread - * doing sigqueue_free() if we have SIGQUEUE_PREALLOC signals. 
- */ -- flush_sigqueue(&tsk->pending); -+ flush_task_sigqueue(tsk); - tsk->sighand = NULL; - spin_unlock(&sighand->siglock); - -diff --git a/kernel/fork.c b/kernel/fork.c -index eaa47c928f96..1e97f271ac59 100644 ---- a/kernel/fork.c -+++ b/kernel/fork.c -@@ -2025,6 +2025,7 @@ static __latent_entropy struct task_struct *copy_process( - spin_lock_init(&p->alloc_lock); - - init_sigpending(&p->pending); -+ p->sigqueue_cache = NULL; - - p->utime = p->stime = p->gtime = 0; - #ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME -diff --git a/kernel/signal.c b/kernel/signal.c -index fde69283d03f..bad7873f8949 100644 ---- a/kernel/signal.c -+++ b/kernel/signal.c -@@ -20,6 +20,7 @@ - #include <linux/sched/task.h> - #include <linux/sched/task_stack.h> - #include <linux/sched/cputime.h> -+#include <linux/sched/rt.h> - #include <linux/file.h> - #include <linux/fs.h> - #include <linux/proc_fs.h> -@@ -404,13 +405,30 @@ void task_join_group_stop(struct task_struct *task) - task_set_jobctl_pending(task, mask | JOBCTL_STOP_PENDING); - } - -+static inline struct sigqueue *get_task_cache(struct task_struct *t) -+{ -+ struct sigqueue *q = t->sigqueue_cache; -+ -+ if (cmpxchg(&t->sigqueue_cache, q, NULL) != q) -+ return NULL; -+ return q; -+} -+ -+static inline int put_task_cache(struct task_struct *t, struct sigqueue *q) -+{ -+ if (cmpxchg(&t->sigqueue_cache, NULL, q) == NULL) -+ return 0; -+ return 1; -+} -+ - /* - * allocate a new signal queue record - * - this may be called without locks if and only if t == current, otherwise an - * appropriate lock must be held to stop the target task from exiting - */ - static struct sigqueue * --__sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimit) -+__sigqueue_do_alloc(int sig, struct task_struct *t, gfp_t flags, -+ int override_rlimit, int fromslab) - { - struct sigqueue *q = NULL; - struct user_struct *user; -@@ -432,7 +450,10 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimi - rcu_read_unlock(); 
- - if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) { -- q = kmem_cache_alloc(sigqueue_cachep, flags); -+ if (!fromslab) -+ q = get_task_cache(t); -+ if (!q) -+ q = kmem_cache_alloc(sigqueue_cachep, flags); - } else { - print_dropped_signal(sig); - } -@@ -449,6 +470,13 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimi - return q; - } - -+static struct sigqueue * -+__sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, -+ int override_rlimit) -+{ -+ return __sigqueue_do_alloc(sig, t, flags, override_rlimit, 0); -+} -+ - static void __sigqueue_free(struct sigqueue *q) - { - if (q->flags & SIGQUEUE_PREALLOC) -@@ -458,6 +486,21 @@ static void __sigqueue_free(struct sigqueue *q) - kmem_cache_free(sigqueue_cachep, q); - } - -+static void sigqueue_free_current(struct sigqueue *q) -+{ -+ struct user_struct *up; -+ -+ if (q->flags & SIGQUEUE_PREALLOC) -+ return; -+ -+ up = q->user; -+ if (rt_prio(current->normal_prio) && !put_task_cache(current, q)) { -+ atomic_dec(&up->sigpending); -+ free_uid(up); -+ } else -+ __sigqueue_free(q); -+} -+ - void flush_sigqueue(struct sigpending *queue) - { - struct sigqueue *q; -@@ -470,6 +513,21 @@ void flush_sigqueue(struct sigpending *queue) - } - } - -+/* -+ * Called from __exit_signal. Flush tsk->pending and -+ * tsk->sigqueue_cache -+ */ -+void flush_task_sigqueue(struct task_struct *tsk) -+{ -+ struct sigqueue *q; -+ -+ flush_sigqueue(&tsk->pending); -+ -+ q = get_task_cache(tsk); -+ if (q) -+ kmem_cache_free(sigqueue_cachep, q); -+} -+ - /* - * Flush all pending signals for this kthread. - */ -@@ -594,7 +652,7 @@ static void collect_signal(int sig, struct sigpending *list, kernel_siginfo_t *i - (info->si_code == SI_TIMER) && - (info->si_sys_private); - -- __sigqueue_free(first); -+ sigqueue_free_current(first); - } else { - /* - * Ok, it wasn't in the queue. 
This must be -@@ -631,6 +689,8 @@ int dequeue_signal(struct task_struct *tsk, sigset_t *mask, kernel_siginfo_t *in - bool resched_timer = false; - int signr; - -+ WARN_ON_ONCE(tsk != current); -+ - /* We only dequeue private signals from ourselves, we don't let - * signalfd steal them - */ -@@ -1835,7 +1895,8 @@ EXPORT_SYMBOL(kill_pid); - */ - struct sigqueue *sigqueue_alloc(void) - { -- struct sigqueue *q = __sigqueue_alloc(-1, current, GFP_KERNEL, 0); -+ /* Preallocated sigqueue objects always from the slabcache ! */ -+ struct sigqueue *q = __sigqueue_do_alloc(-1, current, GFP_KERNEL, 0, 1); - - if (q) - q->flags |= SIGQUEUE_PREALLOC; --- -2.30.2 - diff --git a/debian/patches-rt/0291-signal-Prevent-double-free-of-user-struct.patch b/debian/patches-rt/0291-signal-Prevent-double-free-of-user-struct.patch deleted file mode 100644 index b1126edd9..000000000 --- a/debian/patches-rt/0291-signal-Prevent-double-free-of-user-struct.patch +++ /dev/null @@ -1,52 +0,0 @@ -From a8bcbfba872add3e24d04edfc9a0821dda254e5f Mon Sep 17 00:00:00 2001 -From: Matt Fleming <matt@codeblueprint.co.uk> -Date: Tue, 7 Apr 2020 10:54:13 +0100 -Subject: [PATCH 291/296] signal: Prevent double-free of user struct -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -The way user struct reference counting works changed significantly with, - - fda31c50292a ("signal: avoid double atomic counter increments for user accounting") - -Now user structs are only freed once the last pending signal is -dequeued. Make sigqueue_free_current() follow this new convention to -avoid freeing the user struct multiple times and triggering this -warning: - - refcount_t: underflow; use-after-free. 
- WARNING: CPU: 0 PID: 6794 at lib/refcount.c:288 refcount_dec_not_one+0x45/0x50 - Call Trace: - refcount_dec_and_lock_irqsave+0x16/0x60 - free_uid+0x31/0xa0 - __dequeue_signal+0x17c/0x190 - dequeue_signal+0x5a/0x1b0 - do_sigtimedwait+0x208/0x250 - __x64_sys_rt_sigtimedwait+0x6f/0xd0 - do_syscall_64+0x72/0x200 - entry_SYSCALL_64_after_hwframe+0x49/0xbe - -Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> -Reported-by: Daniel Wagner <wagi@monom.org> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - kernel/signal.c | 4 ++-- - 1 file changed, 2 insertions(+), 2 deletions(-) - -diff --git a/kernel/signal.c b/kernel/signal.c -index bad7873f8949..0be3c40c5662 100644 ---- a/kernel/signal.c -+++ b/kernel/signal.c -@@ -495,8 +495,8 @@ static void sigqueue_free_current(struct sigqueue *q) - - up = q->user; - if (rt_prio(current->normal_prio) && !put_task_cache(current, q)) { -- atomic_dec(&up->sigpending); -- free_uid(up); -+ if (atomic_dec_and_test(&up->sigpending)) -+ free_uid(up); - } else - __sigqueue_free(q); - } --- -2.30.2 - diff --git a/debian/patches-rt/0292-genirq-Disable-irqpoll-on-rt.patch b/debian/patches-rt/0292-genirq-Disable-irqpoll-on-rt.patch deleted file mode 100644 index 796c3704a..000000000 --- a/debian/patches-rt/0292-genirq-Disable-irqpoll-on-rt.patch +++ /dev/null @@ -1,43 +0,0 @@ -From dfcff2b5e4f129486f90d645d531f167feabe5ca Mon Sep 17 00:00:00 2001 -From: Ingo Molnar <mingo@elte.hu> -Date: Fri, 3 Jul 2009 08:29:57 -0500 -Subject: [PATCH 292/296] genirq: Disable irqpoll on -rt -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Creates long latencies for no value - -Signed-off-by: Ingo Molnar <mingo@elte.hu> -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - kernel/irq/spurious.c | 8 ++++++++ - 1 file changed, 8 insertions(+) - -diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c -index f865e5f4d382..dc7311dd74b1 100644 ---- a/kernel/irq/spurious.c -+++ 
b/kernel/irq/spurious.c -@@ -443,6 +443,10 @@ MODULE_PARM_DESC(noirqdebug, "Disable irq lockup detection when true"); - - static int __init irqfixup_setup(char *str) - { -+#ifdef CONFIG_PREEMPT_RT -+ pr_warn("irqfixup boot option not supported w/ CONFIG_PREEMPT_RT\n"); -+ return 1; -+#endif - irqfixup = 1; - printk(KERN_WARNING "Misrouted IRQ fixup support enabled.\n"); - printk(KERN_WARNING "This may impact system performance.\n"); -@@ -455,6 +459,10 @@ module_param(irqfixup, int, 0644); - - static int __init irqpoll_setup(char *str) - { -+#ifdef CONFIG_PREEMPT_RT -+ pr_warn("irqpoll boot option not supported w/ CONFIG_PREEMPT_RT\n"); -+ return 1; -+#endif - irqfixup = 2; - printk(KERN_WARNING "Misrouted IRQ fixup and polling support " - "enabled\n"); --- -2.30.2 - diff --git a/debian/patches-rt/0294-Add-localversion-for-RT-release.patch b/debian/patches-rt/0294-Add-localversion-for-RT-release.patch deleted file mode 100644 index 0b7845620..000000000 --- a/debian/patches-rt/0294-Add-localversion-for-RT-release.patch +++ /dev/null @@ -1,22 +0,0 @@ -From d2e8ab7f33a3994e5ea24554f933ca1bbfdb9b89 Mon Sep 17 00:00:00 2001 -From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 8 Jul 2011 20:25:16 +0200 -Subject: [PATCH 294/296] Add localversion for -RT release -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - localversion-rt | 1 + - 1 file changed, 1 insertion(+) - create mode 100644 localversion-rt - -diff --git a/localversion-rt b/localversion-rt -new file mode 100644 -index 000000000000..21988f9ad53f ---- /dev/null -+++ b/localversion-rt -@@ -0,0 +1 @@ -+-rt34 --- -2.30.2 - diff --git a/debian/patches-rt/0295-net-xfrm-Use-sequence-counter-with-associated-spinlo.patch b/debian/patches-rt/0295-net-xfrm-Use-sequence-counter-with-associated-spinlo.patch deleted file mode 100644 index 19c98b843..000000000 --- 
a/debian/patches-rt/0295-net-xfrm-Use-sequence-counter-with-associated-spinlo.patch +++ /dev/null @@ -1,43 +0,0 @@ -From 52599b48cb964ab162adb0a7467b5b7a4aa5e756 Mon Sep 17 00:00:00 2001 -From: "Ahmed S. Darwish" <a.darwish@linutronix.de> -Date: Tue, 16 Mar 2021 11:56:30 +0100 -Subject: [PATCH 295/296] net: xfrm: Use sequence counter with associated - spinlock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - -A sequence counter write section must be serialized or its internal -state can get corrupted. A plain seqcount_t does not contain the -information of which lock must be held to guaranteee write side -serialization. - -For xfrm_state_hash_generation, use seqcount_spinlock_t instead of plain -seqcount_t. This allows to associate the spinlock used for write -serialization with the sequence counter. It thus enables lockdep to -verify that the write serialization lock is indeed held before entering -the sequence counter write section. - -If lockdep is disabled, this lock association is compiled out and has -neither storage size nor runtime overhead. - -Signed-off-by: Ahmed S. 
Darwish <a.darwish@linutronix.de> -Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> ---- - include/net/netns/xfrm.h | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/include/net/netns/xfrm.h b/include/net/netns/xfrm.h -index b59d73d529ba..e816b6a3ef2b 100644 ---- a/include/net/netns/xfrm.h -+++ b/include/net/netns/xfrm.h -@@ -73,7 +73,7 @@ struct netns_xfrm { - struct dst_ops xfrm6_dst_ops; - #endif - spinlock_t xfrm_state_lock; -- seqcount_t xfrm_state_hash_generation; -+ seqcount_spinlock_t xfrm_state_hash_generation; - - spinlock_t xfrm_policy_lock; - struct mutex xfrm_cfg_mutex; --- -2.30.2 - diff --git a/debian/patches-rt/0296-Linux-5.10.35-rt39-REBASE.patch b/debian/patches-rt/0296-Linux-5.10.35-rt39-REBASE.patch deleted file mode 100644 index 1d95dafef..000000000 --- a/debian/patches-rt/0296-Linux-5.10.35-rt39-REBASE.patch +++ /dev/null @@ -1,20 +0,0 @@ -From 22c8243be7c22f5ea01f96a52d17b7e9c26c4851 Mon Sep 17 00:00:00 2001 -From: "Steven Rostedt (VMware)" <rostedt@goodmis.org> -Date: Fri, 7 May 2021 17:46:01 -0400 -Subject: [PATCH 296/296] Linux 5.10.35-rt39 REBASE -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz - ---- - localversion-rt | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/localversion-rt b/localversion-rt -index 21988f9ad53f..5498386d0d0c 100644 ---- a/localversion-rt -+++ b/localversion-rt -@@ -1 +1 @@ ---rt34 -+-rt39 --- -2.30.2 - diff --git a/debian/patches-rt/0281-ARM64-Allow-to-enable-RT.patch b/debian/patches-rt/ARM64__Allow_to_enable_RT.patch index dd176a8d5..5f2a9a18d 100644 --- a/debian/patches-rt/0281-ARM64-Allow-to-enable-RT.patch +++ b/debian/patches-rt/ARM64__Allow_to_enable_RT.patch @@ -1,36 +1,35 @@ -From d331f8c79fe6faacb5fc673c9839c2f1cf8a97e0 Mon Sep 17 00:00:00 2001 +Subject: ARM64: Allow to enable RT +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri Oct 11 13:14:35 2019 +0200 +Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 11 Oct 2019 13:14:35 +0200 -Subject: [PATCH 281/296] ARM64: Allow to enable RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Allow to select RT. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/arm64/Kconfig | 2 ++ + arch/arm64/Kconfig | 2 ++ 1 file changed, 2 insertions(+) - -diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig -index 048f50fbcf89..a2514d685485 100644 +--- --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig -@@ -76,6 +76,7 @@ config ARM64 +@@ -88,6 +88,7 @@ config ARM64 select ARCH_SUPPORTS_ATOMIC_RMW - select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && (GCC_VERSION >= 50000 || CC_IS_CLANG) + select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 select ARCH_SUPPORTS_NUMA_BALANCING + select ARCH_SUPPORTS_RT if HAVE_POSIX_CPU_TIMERS_TASK_WORK select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT select ARCH_WANT_DEFAULT_BPF_JIT select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT -@@ -195,6 +196,7 @@ config ARM64 +@@ -214,6 +215,7 @@ config ARM64 select PCI_DOMAINS_GENERIC if PCI select PCI_ECAM if (ACPI && PCI) select PCI_SYSCALL if PCI + select HAVE_POSIX_CPU_TIMERS_TASK_WORK if !KVM select POWER_RESET select POWER_SUPPLY - select SET_FS --- -2.30.2 - + select SPARSE_IRQ diff --git a/debian/patches-rt/0280-ARM-Allow-to-enable-RT.patch b/debian/patches-rt/ARM__Allow_to_enable_RT.patch index 58d6b8cb0..014e56ce9 100644 --- a/debian/patches-rt/0280-ARM-Allow-to-enable-RT.patch +++ b/debian/patches-rt/ARM__Allow_to_enable_RT.patch @@ -1,36 +1,35 @@ -From 71bbc49a464cf9b0a952e8d68887bb8c1a794463 Mon Sep 17 00:00:00 2001 +Subject: ARM: Allow to enable RT +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri Oct 11 13:14:29 2019 +0200 +Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 11 Oct 2019 13:14:29 +0200 -Subject: [PATCH 280/296] ARM: Allow to enable RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Allow to select RT. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/arm/Kconfig | 2 ++ + arch/arm/Kconfig | 2 ++ 1 file changed, 2 insertions(+) - -diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig -index 07f43a0e8189..8a7ad1ef6122 100644 +--- --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig -@@ -31,6 +31,7 @@ config ARM - select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX +@@ -32,6 +32,7 @@ config ARM select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT if CPU_V7 select ARCH_SUPPORTS_ATOMIC_RMW + select ARCH_SUPPORTS_HUGETLBFS if ARM_LPAE + select ARCH_SUPPORTS_RT if HAVE_POSIX_CPU_TIMERS_TASK_WORK select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF - select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU -@@ -121,6 +122,7 @@ config ARM + select ARCH_USE_MEMTEST +@@ -125,6 +126,7 @@ config ARM select OLD_SIGSUSPEND3 select PCI_SYSCALL if PCI select PERF_USE_VMALLOC + select HAVE_POSIX_CPU_TIMERS_TASK_WORK if !KVM select RTC_LIB - select SET_FS select SYS_SUPPORTS_APM_EMULATION --- -2.30.2 - + select TRACE_IRQFLAGS_SUPPORT if !CPU_V7M diff --git a/debian/patches-rt/0275-ARM-enable-irq-in-translation-section-permission-fau.patch b/debian/patches-rt/ARM__enable_irq_in_translation_section_permission_fault_handlers.patch index c7f4d1a0b..433f5b950 100644 --- a/debian/patches-rt/0275-ARM-enable-irq-in-translation-section-permission-fau.patch +++ b/debian/patches-rt/ARM__enable_irq_in_translation_section_permission_fault_handlers.patch @@ -1,12 +1,9 @@ -From 946727ba2383dce6544138e73e9eab353a25e3a2 Mon Sep 17 00:00:00 2001 -From: "Yadi.hu" 
<yadi.hu@windriver.com> -Date: Wed, 10 Dec 2014 10:32:09 +0800 -Subject: [PATCH 275/296] ARM: enable irq in translation/section permission - fault handlers -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Subject: ARM: enable irq in translation/section permission fault handlers +From: Yadi.hu <yadi.hu@windriver.com> +Date: Wed Dec 10 10:32:09 2014 +0800 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Yadi.hu <yadi.hu@windriver.com> Probably happens on all ARM, with CONFIG_PREEMPT_RT @@ -63,15 +60,16 @@ permission exception. Signed-off-by: Yadi.hu <yadi.hu@windriver.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/arm/mm/fault.c | 6 ++++++ + arch/arm/mm/fault.c | 6 ++++++ 1 file changed, 6 insertions(+) - -diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c -index efa402025031..59487ee9fd61 100644 +--- --- a/arch/arm/mm/fault.c +++ b/arch/arm/mm/fault.c -@@ -400,6 +400,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr, +@@ -400,6 +400,9 @@ do_translation_fault(unsigned long addr, if (addr < TASK_SIZE) return do_page_fault(addr, fsr, regs); @@ -81,7 +79,7 @@ index efa402025031..59487ee9fd61 100644 if (user_mode(regs)) goto bad_area; -@@ -470,6 +473,9 @@ do_translation_fault(unsigned long addr, unsigned int fsr, +@@ -470,6 +473,9 @@ do_translation_fault(unsigned long addr, static int do_sect_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs) { @@ -91,6 +89,3 @@ index efa402025031..59487ee9fd61 100644 do_bad_area(addr, fsr, regs); return 0; } --- -2.30.2 - diff --git a/debian/patches-rt/ASoC-mediatek-mt8195-Remove-unsued-irqs_lock.patch b/debian/patches-rt/ASoC-mediatek-mt8195-Remove-unsued-irqs_lock.patch new file mode 100644 index 
000000000..4ab40ee8e --- /dev/null +++ b/debian/patches-rt/ASoC-mediatek-mt8195-Remove-unsued-irqs_lock.patch @@ -0,0 +1,30 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 9 Sep 2021 10:15:30 +0200 +Subject: [PATCH] ASoC: mediatek: mt8195: Remove unsued irqs_lock. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +irqs_lock is not used, never was. + +Remove irqs_lock. + +Fixes: 283b612429a27 ("ASoC: mediatek: implement mediatek common structure") +Cc: Liam Girdwood <lgirdwood@gmail.com> +Cc: Mark Brown <broonie@kernel.org> +Cc: Jaroslav Kysela <perex@perex.cz> +Cc: Takashi Iwai <tiwai@suse.com> +Cc: Matthias Brugger <matthias.bgg@gmail.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + sound/soc/mediatek/common/mtk-afe-fe-dai.c | 1 - + 1 file changed, 1 deletion(-) + +--- a/sound/soc/mediatek/common/mtk-afe-fe-dai.c ++++ b/sound/soc/mediatek/common/mtk-afe-fe-dai.c +@@ -288,7 +288,6 @@ const struct snd_soc_dai_ops mtk_afe_fe_ + }; + EXPORT_SYMBOL_GPL(mtk_afe_fe_ops); + +-static DEFINE_MUTEX(irqs_lock); + int mtk_dynamic_irq_acquire(struct mtk_base_afe *afe) + { + int i; diff --git a/debian/patches-rt/Add_localversion_for_-RT_release.patch b/debian/patches-rt/Add_localversion_for_-RT_release.patch new file mode 100644 index 000000000..6b82e1327 --- /dev/null +++ b/debian/patches-rt/Add_localversion_for_-RT_release.patch @@ -0,0 +1,19 @@ +Subject: Add localversion for -RT release +From: Thomas Gleixner <tglx@linutronix.de> +Date: Fri Jul 8 20:25:16 2011 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Thomas Gleixner <tglx@linutronix.de> + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + +--- + localversion-rt | 1 + + 1 file changed, 1 insertion(+) + create mode 100644 localversion-rt +--- +--- /dev/null ++++ b/localversion-rt +@@ -0,0 +1 @@ ++-rt21 diff --git 
a/debian/patches-rt/0277-KVM-arm-arm64-downgrade-preempt_disable-d-region-to-.patch b/debian/patches-rt/KVM__arm_arm64__downgrade_preempt_disabled_region_to_migrate_disable.patch index fb9b7a117..523e1643f 100644 --- a/debian/patches-rt/0277-KVM-arm-arm64-downgrade-preempt_disable-d-region-to-.patch +++ b/debian/patches-rt/KVM__arm_arm64__downgrade_preempt_disabled_region_to_migrate_disable.patch @@ -1,9 +1,9 @@ -From 208fcd6b96ecec18373bc9047d6006f9c4080a04 Mon Sep 17 00:00:00 2001 +Subject: KVM: arm/arm64: downgrade preempt_disable()d region to migrate_disable() +From: Josh Cartwright <joshc@ni.com> +Date: Thu Feb 11 11:54:01 2016 -0600 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Josh Cartwright <joshc@ni.com> -Date: Thu, 11 Feb 2016 11:54:01 -0600 -Subject: [PATCH 277/296] KVM: arm/arm64: downgrade preempt_disable()d region - to migrate_disable() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz kvm_arch_vcpu_ioctl_run() disables the use of preemption when updating the vgic and timer states to prevent the calling task from migrating to @@ -19,15 +19,16 @@ Cc: Christoffer Dall <christoffer.dall@linaro.org> Reported-by: Manish Jaggi <Manish.Jaggi@caviumnetworks.com> Signed-off-by: Josh Cartwright <joshc@ni.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/arm64/kvm/arm.c | 6 +++--- + arch/arm64/kvm/arm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) - -diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c -index a1c2c955474e..df1d0d1511d1 100644 +--- --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c -@@ -706,7 +706,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) +@@ -811,7 +811,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v * involves poking the GIC, which must be done in a * non-preemptible context. 
*/ @@ -36,7 +37,7 @@ index a1c2c955474e..df1d0d1511d1 100644 kvm_pmu_flush_hwstate(vcpu); -@@ -755,7 +755,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) +@@ -835,7 +835,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v kvm_timer_sync_user(vcpu); kvm_vgic_sync_hwstate(vcpu); local_irq_enable(); @@ -45,7 +46,7 @@ index a1c2c955474e..df1d0d1511d1 100644 continue; } -@@ -827,7 +827,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) +@@ -907,7 +907,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_v /* Exit types that need handling before we can be preempted */ handle_exit_early(vcpu, ret); @@ -54,6 +55,3 @@ index a1c2c955474e..df1d0d1511d1 100644 /* * The ARMv8 architecture doesn't give the hypervisor --- -2.30.2 - diff --git a/debian/patches-rt/POWERPC__Allow_to_enable_RT.patch b/debian/patches-rt/POWERPC__Allow_to_enable_RT.patch new file mode 100644 index 000000000..2bdb47eb6 --- /dev/null +++ b/debian/patches-rt/POWERPC__Allow_to_enable_RT.patch @@ -0,0 +1,35 @@ +Subject: POWERPC: Allow to enable RT +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri Oct 11 13:14:41 2019 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> + +Allow to select RT. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + +--- + arch/powerpc/Kconfig | 2 ++ + 1 file changed, 2 insertions(+) +--- +--- a/arch/powerpc/Kconfig ++++ b/arch/powerpc/Kconfig +@@ -151,6 +151,7 @@ config PPC + select ARCH_STACKWALK + select ARCH_SUPPORTS_ATOMIC_RMW + select ARCH_SUPPORTS_DEBUG_PAGEALLOC if PPC_BOOK3S || PPC_8xx || 40x ++ select ARCH_SUPPORTS_RT if HAVE_POSIX_CPU_TIMERS_TASK_WORK + select ARCH_USE_BUILTIN_BSWAP + select ARCH_USE_CMPXCHG_LOCKREF if PPC64 + select ARCH_USE_MEMTEST +@@ -219,6 +220,7 @@ config PPC + select HAVE_IOREMAP_PROT + select HAVE_IRQ_EXIT_ON_IRQ_STACK + select HAVE_IRQ_TIME_ACCOUNTING ++ select HAVE_POSIX_CPU_TIMERS_TASK_WORK if !KVM + select HAVE_KERNEL_GZIP + select HAVE_KERNEL_LZMA if DEFAULT_UIMAGE + select HAVE_KERNEL_LZO if DEFAULT_UIMAGE diff --git a/debian/patches-rt/0270-arch-arm64-Add-lazy-preempt-support.patch b/debian/patches-rt/arch_arm64__Add_lazy_preempt_support.patch index 1fdeaa13b..f2b6ccf95 100644 --- a/debian/patches-rt/0270-arch-arm64-Add-lazy-preempt-support.patch +++ b/debian/patches-rt/arch_arm64__Add_lazy_preempt_support.patch @@ -1,8 +1,9 @@ -From aa08ef4ecd16fbb2c3f21e2311f202a1f1e4bb37 Mon Sep 17 00:00:00 2001 +Subject: arch/arm64: Add lazy preempt support +From: Anders Roxell <anders.roxell@linaro.org> +Date: Thu May 14 17:52:17 2015 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Anders Roxell <anders.roxell@linaro.org> -Date: Thu, 14 May 2015 17:52:17 +0200 -Subject: [PATCH 270/296] arch/arm64: Add lazy preempt support -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz arm64 is missing support for PREEMPT_RT. The main feature which is lacking is support for lazy preemption. 
The arch-specific entry code, @@ -12,32 +13,30 @@ to be extended to indicate the support is available, and also to indicate that support for full RT preemption is now available. Signed-off-by: Anders Roxell <anders.roxell@linaro.org> ---- - arch/arm64/Kconfig | 1 + - arch/arm64/include/asm/preempt.h | 25 ++++++++++++++++++++++++- - arch/arm64/include/asm/thread_info.h | 7 ++++++- - arch/arm64/kernel/asm-offsets.c | 1 + - arch/arm64/kernel/entry.S | 13 +++++++++++-- - arch/arm64/kernel/signal.c | 2 +- - 6 files changed, 44 insertions(+), 5 deletions(-) +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig -index 5e5cf3af6351..048f50fbcf89 100644 + +--- + arch/arm64/Kconfig | 1 + + arch/arm64/include/asm/preempt.h | 25 ++++++++++++++++++++++++- + arch/arm64/include/asm/thread_info.h | 8 +++++++- + arch/arm64/kernel/asm-offsets.c | 1 + + arch/arm64/kernel/signal.c | 2 +- + 5 files changed, 34 insertions(+), 3 deletions(-) +--- --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig -@@ -173,6 +173,7 @@ config ARM64 - select HAVE_PERF_EVENTS +@@ -192,6 +192,7 @@ config ARM64 select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP -+ select HAVE_PREEMPT_LAZY select HAVE_REGS_AND_STACK_ACCESS_API ++ select HAVE_PREEMPT_LAZY select HAVE_FUNCTION_ARG_ACCESS_API select HAVE_FUTEX_CMPXCHG if FUTEX -diff --git a/arch/arm64/include/asm/preempt.h b/arch/arm64/include/asm/preempt.h -index f06a23898540..994f997b1572 100644 + select MMU_GATHER_RCU_TABLE_FREE --- a/arch/arm64/include/asm/preempt.h +++ b/arch/arm64/include/asm/preempt.h -@@ -70,13 +70,36 @@ static inline bool __preempt_count_dec_and_test(void) +@@ -70,13 +70,36 @@ static inline bool __preempt_count_dec_a * interrupt occurring between the non-atomic READ_ONCE/WRITE_ONCE * pair. 
*/ @@ -75,11 +74,9 @@ index f06a23898540..994f997b1572 100644 } #ifdef CONFIG_PREEMPTION -diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h -index 1fbab854a51b..148b53dc2840 100644 --- a/arch/arm64/include/asm/thread_info.h +++ b/arch/arm64/include/asm/thread_info.h -@@ -29,6 +29,7 @@ struct thread_info { +@@ -26,6 +26,7 @@ struct thread_info { #ifdef CONFIG_ARM64_SW_TTBR0_PAN u64 ttbr0; /* saved TTBR0_EL1 */ #endif @@ -87,80 +84,53 @@ index 1fbab854a51b..148b53dc2840 100644 union { u64 preempt_count; /* 0 => preemptible, <0 => bug */ struct { -@@ -68,6 +69,7 @@ void arch_release_task_struct(struct task_struct *tsk); +@@ -67,6 +68,7 @@ int arch_dup_task_struct(struct task_str #define TIF_UPROBE 4 /* uprobe breakpoint or singlestep */ - #define TIF_FSCHECK 5 /* Check FS is USER_DS on return */ - #define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */ + #define TIF_MTE_ASYNC_FAULT 5 /* MTE Asynchronous Tag Check Fault */ + #define TIF_NOTIFY_SIGNAL 6 /* signal notifications exist */ +#define TIF_NEED_RESCHED_LAZY 7 #define TIF_SYSCALL_TRACE 8 /* syscall trace active */ #define TIF_SYSCALL_AUDIT 9 /* syscall auditing */ #define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */ -@@ -98,11 +100,14 @@ void arch_release_task_struct(struct task_struct *tsk); - #define _TIF_32BIT (1 << TIF_32BIT) +@@ -97,8 +99,10 @@ int arch_dup_task_struct(struct task_str #define _TIF_SVE (1 << TIF_SVE) #define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT) + #define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL) +#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY) - #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ +-#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ ++#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \ ++ _TIF_SIGPENDING | \ _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \ -- _TIF_UPROBE | _TIF_FSCHECK | _TIF_MTE_ASYNC_FAULT) -+ _TIF_UPROBE | _TIF_FSCHECK | 
_TIF_MTE_ASYNC_FAULT | \ -+ _TIF_NEED_RESCHED_LAZY) - -+#define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY) - #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \ + _TIF_UPROBE | _TIF_MTE_ASYNC_FAULT | \ + _TIF_NOTIFY_SIGNAL) +@@ -107,6 +111,8 @@ int arch_dup_task_struct(struct task_str _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \ _TIF_SYSCALL_EMU) -diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c -index 7d32fc959b1a..b2f29bd2ae87 100644 + ++#define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY) ++ + #ifdef CONFIG_SHADOW_CALL_STACK + #define INIT_SCS \ + .scs_base = init_shadow_call_stack, \ --- a/arch/arm64/kernel/asm-offsets.c +++ b/arch/arm64/kernel/asm-offsets.c -@@ -30,6 +30,7 @@ int main(void) +@@ -31,6 +31,7 @@ int main(void) BLANK(); DEFINE(TSK_TI_FLAGS, offsetof(struct task_struct, thread_info.flags)); DEFINE(TSK_TI_PREEMPT, offsetof(struct task_struct, thread_info.preempt_count)); + DEFINE(TSK_TI_PREEMPT_LAZY, offsetof(struct task_struct, thread_info.preempt_lazy_count)); - DEFINE(TSK_TI_ADDR_LIMIT, offsetof(struct task_struct, thread_info.addr_limit)); #ifdef CONFIG_ARM64_SW_TTBR0_PAN DEFINE(TSK_TI_TTBR0, offsetof(struct task_struct, thread_info.ttbr0)); -diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S -index 2da82c139e1c..32c907a71ca4 100644 ---- a/arch/arm64/kernel/entry.S -+++ b/arch/arm64/kernel/entry.S -@@ -651,9 +651,18 @@ alternative_if ARM64_HAS_IRQ_PRIO_MASKING - mrs x0, daif - orr x24, x24, x0 - alternative_else_nop_endif -- cbnz x24, 1f // preempt count != 0 || NMI return path -- bl arm64_preempt_schedule_irq // irq en/disable is done inside -+ -+ cbz x24, 1f // (need_resched + count) == 0 -+ cbnz w24, 2f // count != 0 -+ -+ ldr w24, [tsk, #TSK_TI_PREEMPT_LAZY] // get preempt lazy count -+ cbnz w24, 2f // preempt lazy count != 0 -+ -+ ldr x0, [tsk, #TSK_TI_FLAGS] // get flags -+ tbz x0, #TIF_NEED_RESCHED_LAZY, 2f // needs rescheduling? 
- 1: -+ bl arm64_preempt_schedule_irq // irq en/disable is done inside -+2: #endif - - mov x0, sp -diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c -index 50852992752b..aafe59f680e8 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c -@@ -918,7 +918,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs, - /* Check valid user FS if needed */ - addr_limit_user_check(); - +@@ -920,7 +920,7 @@ static void do_signal(struct pt_regs *re + void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags) + { + do { - if (thread_flags & _TIF_NEED_RESCHED) { + if (thread_flags & _TIF_NEED_RESCHED_MASK) { /* Unmask Debug and SError for the next task */ local_daif_restore(DAIF_PROCCTX_NOIRQ); --- -2.30.2 - diff --git a/debian/patches-rt/arm64-signal-Use-ARCH_RT_DELAYS_SIGNAL_SEND.patch b/debian/patches-rt/arm64-signal-Use-ARCH_RT_DELAYS_SIGNAL_SEND.patch new file mode 100644 index 000000000..324d9171e --- /dev/null +++ b/debian/patches-rt/arm64-signal-Use-ARCH_RT_DELAYS_SIGNAL_SEND.patch @@ -0,0 +1,49 @@ +From: He Zhe <zhe.he@windriver.com> +Date: Tue, 12 Oct 2021 16:44:21 +0800 +Subject: [PATCH] arm64: signal: Use ARCH_RT_DELAYS_SIGNAL_SEND. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The software breakpoint is handled via do_debug_exception() which +disables preemption. On PREEMPT_RT spinlock_t become sleeping locks and +must not be acquired with disabled preemption. + +Use ARCH_RT_DELAYS_SIGNAL_SEND so the signal (from send_user_sigtrap()) +is sent delayed in return to userland. 
+ +Cc: stable-rt@vger.kernel.org +Signed-off-by: He Zhe <zhe.he@windriver.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211012084421.35136-1-zhe.he@windriver.com +--- + arch/arm64/include/asm/signal.h | 4 ++++ + arch/arm64/kernel/signal.c | 8 ++++++++ + 2 files changed, 12 insertions(+) + +--- a/arch/arm64/include/asm/signal.h ++++ b/arch/arm64/include/asm/signal.h +@@ -22,4 +22,8 @@ static inline void __user *arch_untagged + } + #define arch_untagged_si_addr arch_untagged_si_addr + ++#if defined(CONFIG_PREEMPT_RT) ++#define ARCH_RT_DELAYS_SIGNAL_SEND ++#endif ++ + #endif +--- a/arch/arm64/kernel/signal.c ++++ b/arch/arm64/kernel/signal.c +@@ -928,6 +928,14 @@ void do_notify_resume(struct pt_regs *re + } else { + local_daif_restore(DAIF_PROCCTX); + ++#ifdef ARCH_RT_DELAYS_SIGNAL_SEND ++ if (unlikely(current->forced_info.si_signo)) { ++ struct task_struct *t = current; ++ force_sig_info(&t->forced_info); ++ t->forced_info.si_signo = 0; ++ } ++#endif ++ + if (thread_flags & _TIF_UPROBE) + uprobe_notify_resume(regs); + diff --git a/debian/patches-rt/arm64-sve-Delay-freeing-memory-in-fpsimd_flush_threa.patch b/debian/patches-rt/arm64-sve-Delay-freeing-memory-in-fpsimd_flush_threa.patch new file mode 100644 index 000000000..3912d4215 --- /dev/null +++ b/debian/patches-rt/arm64-sve-Delay-freeing-memory-in-fpsimd_flush_threa.patch @@ -0,0 +1,45 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 29 Jul 2021 12:52:14 +0200 +Subject: [PATCH] arm64/sve: Delay freeing memory in fpsimd_flush_thread() +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +fpsimd_flush_thread() invokes kfree() via sve_free() within a preempt disabled +section which is not working on -RT. + +Delay freeing of memory until preemption is enabled again. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + arch/arm64/kernel/fpsimd.c | 7 ++++++- + 1 file changed, 6 insertions(+), 1 deletion(-) + +--- a/arch/arm64/kernel/fpsimd.c ++++ b/arch/arm64/kernel/fpsimd.c +@@ -1033,6 +1033,7 @@ void fpsimd_thread_switch(struct task_st + void fpsimd_flush_thread(void) + { + int vl, supported_vl; ++ void *sve_state = NULL; + + if (!system_supports_fpsimd()) + return; +@@ -1045,7 +1046,10 @@ void fpsimd_flush_thread(void) + + if (system_supports_sve()) { + clear_thread_flag(TIF_SVE); +- sve_free(current); ++ ++ /* Defer kfree() while in atomic context */ ++ sve_state = current->thread.sve_state; ++ current->thread.sve_state = NULL; + + /* + * Reset the task vector length as required. +@@ -1079,6 +1083,7 @@ void fpsimd_flush_thread(void) + } + + put_cpu_fpsimd_context(); ++ kfree(sve_state); + } + + /* diff --git a/debian/patches-rt/arm64-sve-Make-kernel-FPU-protection-RT-friendly.patch b/debian/patches-rt/arm64-sve-Make-kernel-FPU-protection-RT-friendly.patch new file mode 100644 index 000000000..3dde08520 --- /dev/null +++ b/debian/patches-rt/arm64-sve-Make-kernel-FPU-protection-RT-friendly.patch @@ -0,0 +1,57 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 29 Jul 2021 10:36:30 +0200 +Subject: [PATCH] arm64/sve: Make kernel FPU protection RT friendly +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Non RT kernels need to protect FPU against preemption and bottom half +processing. This is achieved by disabling bottom halves via +local_bh_disable() which implictly disables preemption. + +On RT kernels this protection mechanism is not sufficient because +local_bh_disable() does not disable preemption. It serializes bottom half +related processing via a CPU local lock. + +As bottom halves are running always in thread context on RT kernels +disabling preemption is the proper choice as it implicitly prevents bottom +half processing. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + arch/arm64/kernel/fpsimd.c | 16 ++++++++++++++-- + 1 file changed, 14 insertions(+), 2 deletions(-) + +--- a/arch/arm64/kernel/fpsimd.c ++++ b/arch/arm64/kernel/fpsimd.c +@@ -179,10 +179,19 @@ static void __get_cpu_fpsimd_context(voi + * + * The double-underscore version must only be called if you know the task + * can't be preempted. ++ * ++ * On RT kernels local_bh_disable() is not sufficient because it only ++ * serializes soft interrupt related sections via a local lock, but stays ++ * preemptible. Disabling preemption is the right choice here as bottom ++ * half processing is always in thread context on RT kernels so it ++ * implicitly prevents bottom half processing as well. + */ + static void get_cpu_fpsimd_context(void) + { +- local_bh_disable(); ++ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) ++ local_bh_disable(); ++ else ++ preempt_disable(); + __get_cpu_fpsimd_context(); + } + +@@ -203,7 +212,10 @@ static void __put_cpu_fpsimd_context(voi + static void put_cpu_fpsimd_context(void) + { + __put_cpu_fpsimd_context(); +- local_bh_enable(); ++ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) ++ local_bh_enable(); ++ else ++ preempt_enable(); + } + + static bool have_cpu_fpsimd_context(void) diff --git a/debian/patches-rt/arm64_mm_make_arch_faults_on_old_pte_check_for_migratability.patch b/debian/patches-rt/arm64_mm_make_arch_faults_on_old_pte_check_for_migratability.patch new file mode 100644 index 000000000..0ddc53f11 --- /dev/null +++ b/debian/patches-rt/arm64_mm_make_arch_faults_on_old_pte_check_for_migratability.patch @@ -0,0 +1,34 @@ +From: Valentin Schneider <valentin.schneider@arm.com> +Subject: arm64: mm: Make arch_faults_on_old_pte() check for migratability +Date: Wed, 11 Aug 2021 21:13:54 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +arch_faults_on_old_pte() relies on the calling context being +non-preemptible. 
CONFIG_PREEMPT_RT turns the PTE lock into a sleepable +spinlock, which doesn't disable preemption once acquired, triggering the +warning in arch_faults_on_old_pte(). + +It does however disable migration, ensuring the task remains on the same +CPU during the entirety of the critical section, making the read of +cpu_has_hw_af() safe and stable. + +Make arch_faults_on_old_pte() check migratable() instead of preemptible(). + +Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210811201354.1976839-5-valentin.schneider@arm.com +--- + arch/arm64/include/asm/pgtable.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/arch/arm64/include/asm/pgtable.h ++++ b/arch/arm64/include/asm/pgtable.h +@@ -995,7 +995,7 @@ static inline void update_mmu_cache(stru + */ + static inline bool arch_faults_on_old_pte(void) + { +- WARN_ON(preemptible()); ++ WARN_ON(is_migratable()); + + return !cpu_has_hw_af(); + } diff --git a/debian/patches-rt/arm__Add_support_for_lazy_preemption.patch b/debian/patches-rt/arm__Add_support_for_lazy_preemption.patch new file mode 100644 index 000000000..e94ccd60e --- /dev/null +++ b/debian/patches-rt/arm__Add_support_for_lazy_preemption.patch @@ -0,0 +1,127 @@ +Subject: arm: Add support for lazy preemption +From: Thomas Gleixner <tglx@linutronix.de> +Date: Wed Oct 31 12:04:11 2012 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Thomas Gleixner <tglx@linutronix.de> + +Implement the arm pieces for lazy preempt. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + +--- + arch/arm/Kconfig | 1 + + arch/arm/include/asm/thread_info.h | 6 +++++- + arch/arm/kernel/asm-offsets.c | 1 + + arch/arm/kernel/entry-armv.S | 19 ++++++++++++++++--- + arch/arm/kernel/signal.c | 3 ++- + 5 files changed, 25 insertions(+), 5 deletions(-) +--- +--- a/arch/arm/Kconfig ++++ b/arch/arm/Kconfig +@@ -109,6 +109,7 @@ config ARM + select HAVE_PERF_EVENTS + select HAVE_PERF_REGS + select HAVE_PERF_USER_STACK_DUMP ++ select HAVE_PREEMPT_LAZY + select MMU_GATHER_RCU_TABLE_FREE if SMP && ARM_LPAE + select HAVE_REGS_AND_STACK_ACCESS_API + select HAVE_RSEQ +--- a/arch/arm/include/asm/thread_info.h ++++ b/arch/arm/include/asm/thread_info.h +@@ -52,6 +52,7 @@ struct cpu_context_save { + struct thread_info { + unsigned long flags; /* low level flags */ + int preempt_count; /* 0 => preemptable, <0 => bug */ ++ int preempt_lazy_count; /* 0 => preemptable, <0 => bug */ + struct task_struct *task; /* main task structure */ + __u32 cpu; /* cpu */ + __u32 cpu_domain; /* cpu domain */ +@@ -134,6 +135,7 @@ extern int vfp_restore_user_hwstate(stru + #define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */ + #define TIF_SECCOMP 7 /* seccomp syscall filtering active */ + #define TIF_NOTIFY_SIGNAL 8 /* signal notifications exist */ ++#define TIF_NEED_RESCHED_LAZY 9 + + #define TIF_USING_IWMMXT 17 + #define TIF_MEMDIE 18 /* is terminating due to OOM killer */ +@@ -148,6 +150,7 @@ extern int vfp_restore_user_hwstate(stru + #define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT) + #define _TIF_SECCOMP (1 << TIF_SECCOMP) + #define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL) ++#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY) + #define _TIF_USING_IWMMXT (1 << TIF_USING_IWMMXT) + + /* Checks for any syscall work in entry-common.S */ +@@ -157,7 +160,8 @@ extern int vfp_restore_user_hwstate(stru + /* + * Change these and you break ASM code in entry-common.S + */ +-#define _TIF_WORK_MASK 
(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ ++#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY | \ ++ _TIF_SIGPENDING | \ + _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ + _TIF_NOTIFY_SIGNAL) + +--- a/arch/arm/kernel/asm-offsets.c ++++ b/arch/arm/kernel/asm-offsets.c +@@ -43,6 +43,7 @@ int main(void) + BLANK(); + DEFINE(TI_FLAGS, offsetof(struct thread_info, flags)); + DEFINE(TI_PREEMPT, offsetof(struct thread_info, preempt_count)); ++ DEFINE(TI_PREEMPT_LAZY, offsetof(struct thread_info, preempt_lazy_count)); + DEFINE(TI_TASK, offsetof(struct thread_info, task)); + DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); + DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, cpu_domain)); +--- a/arch/arm/kernel/entry-armv.S ++++ b/arch/arm/kernel/entry-armv.S +@@ -206,11 +206,18 @@ ENDPROC(__dabt_svc) + + #ifdef CONFIG_PREEMPTION + ldr r8, [tsk, #TI_PREEMPT] @ get preempt count +- ldr r0, [tsk, #TI_FLAGS] @ get flags + teq r8, #0 @ if preempt count != 0 ++ bne 1f @ return from exeption ++ ldr r0, [tsk, #TI_FLAGS] @ get flags ++ tst r0, #_TIF_NEED_RESCHED @ if NEED_RESCHED is set ++ blne svc_preempt @ preempt! 
++ ++ ldr r8, [tsk, #TI_PREEMPT_LAZY] @ get preempt lazy count ++ teq r8, #0 @ if preempt lazy count != 0 + movne r0, #0 @ force flags to 0 +- tst r0, #_TIF_NEED_RESCHED ++ tst r0, #_TIF_NEED_RESCHED_LAZY + blne svc_preempt ++1: + #endif + + svc_exit r5, irq = 1 @ return from exception +@@ -225,8 +232,14 @@ ENDPROC(__irq_svc) + 1: bl preempt_schedule_irq @ irq en/disable is done inside + ldr r0, [tsk, #TI_FLAGS] @ get new tasks TI_FLAGS + tst r0, #_TIF_NEED_RESCHED ++ bne 1b ++ tst r0, #_TIF_NEED_RESCHED_LAZY + reteq r8 @ go again +- b 1b ++ ldr r0, [tsk, #TI_PREEMPT_LAZY] @ get preempt lazy count ++ teq r0, #0 @ if preempt lazy count != 0 ++ beq 1b ++ ret r8 @ go again ++ + #endif + + __und_fault: +--- a/arch/arm/kernel/signal.c ++++ b/arch/arm/kernel/signal.c +@@ -607,7 +607,8 @@ do_work_pending(struct pt_regs *regs, un + */ + trace_hardirqs_off(); + do { +- if (likely(thread_flags & _TIF_NEED_RESCHED)) { ++ if (likely(thread_flags & (_TIF_NEED_RESCHED | ++ _TIF_NEED_RESCHED_LAZY))) { + schedule(); + } else { + if (unlikely(!user_mode(regs))) diff --git a/debian/patches-rt/0234-block-mq-do-not-invoke-preempt_disable.patch b/debian/patches-rt/block_mq__do_not_invoke_preempt_disable.patch index eef478ac1..7a2b94bc5 100644 --- a/debian/patches-rt/0234-block-mq-do-not-invoke-preempt_disable.patch +++ b/debian/patches-rt/block_mq__do_not_invoke_preempt_disable.patch @@ -1,23 +1,25 @@ -From ad134fd1f449a7b8982decdb96e331ae60fdb16e Mon Sep 17 00:00:00 2001 +Subject: block/mq: do not invoke preempt_disable() +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue Jul 14 14:26:34 2015 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 14 Jul 2015 14:26:34 +0200 -Subject: [PATCH 234/296] block/mq: do not invoke preempt_disable() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz 
preempt_disable() and get_cpu() don't play well together with the sleeping locks it tries to allocate later. It seems to be enough to replace it with get_cpu_light() and migrate_disable(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - block/blk-mq.c | 6 +++--- + block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) - -diff --git a/block/blk-mq.c b/block/blk-mq.c -index 87dd67e7abdc..b293f74ea8ca 100644 +--- --- a/block/blk-mq.c +++ b/block/blk-mq.c -@@ -1585,14 +1585,14 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async, +@@ -1559,14 +1559,14 @@ static void __blk_mq_delay_run_hw_queue( return; if (!async && !(hctx->flags & BLK_MQ_F_BLOCKING)) { @@ -35,6 +37,3 @@ index 87dd67e7abdc..b293f74ea8ca 100644 } kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work, --- -2.30.2 - diff --git a/debian/patches-rt/0110-cgroup-use-irqsave-in-cgroup_rstat_flush_locked.patch b/debian/patches-rt/cgroup__use_irqsave_in_cgroup_rstat_flush_locked.patch index 161595663..b381a2f81 100644 --- a/debian/patches-rt/0110-cgroup-use-irqsave-in-cgroup_rstat_flush_locked.patch +++ b/debian/patches-rt/cgroup__use_irqsave_in_cgroup_rstat_flush_locked.patch @@ -1,8 +1,9 @@ -From cbe6ddbe9b3cba931d2b4b9ca72981f1f03fa97b Mon Sep 17 00:00:00 2001 +Subject: cgroup: use irqsave in cgroup_rstat_flush_locked() +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue Jul 3 18:19:48 2018 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 3 Jul 2018 18:19:48 +0200 -Subject: [PATCH 110/296] cgroup: use irqsave in cgroup_rstat_flush_locked() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz All callers of cgroup_rstat_flush_locked() acquire cgroup_rstat_lock either with 
spin_lock_irq() or spin_lock_irqsave(). @@ -17,15 +18,17 @@ the interrupts were not disabled here and a deadlock is possible. Acquire the raw_spin_lock_t with disabled interrupts. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Link: https://www.spinics.net/lists/cgroups/msg23051.html + + --- - kernel/cgroup/rstat.c | 5 +++-- + kernel/cgroup/rstat.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) - -diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c -index d51175cedfca..b424f3157b34 100644 +--- --- a/kernel/cgroup/rstat.c +++ b/kernel/cgroup/rstat.c -@@ -149,8 +149,9 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep) +@@ -156,8 +156,9 @@ static void cgroup_rstat_flush_locked(st raw_spinlock_t *cpu_lock = per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu); struct cgroup *pos = NULL; @@ -36,7 +39,7 @@ index d51175cedfca..b424f3157b34 100644 while ((pos = cgroup_rstat_cpu_pop_updated(pos, cgrp, cpu))) { struct cgroup_subsys_state *css; -@@ -162,7 +163,7 @@ static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep) +@@ -169,7 +170,7 @@ static void cgroup_rstat_flush_locked(st css->ss->css_rstat_flush(css, cpu); rcu_read_unlock(); } @@ -45,6 +48,3 @@ index d51175cedfca..b424f3157b34 100644 /* if @may_sleep, play nice and yield if necessary */ if (may_sleep && (need_resched() || --- -2.30.2 - diff --git a/debian/patches-rt/console__add_write_atomic_interface.patch b/debian/patches-rt/console__add_write_atomic_interface.patch new file mode 100644 index 000000000..8de52e9a7 --- /dev/null +++ b/debian/patches-rt/console__add_write_atomic_interface.patch @@ -0,0 +1,315 @@ +Subject: console: add write_atomic interface +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:01 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: John Ogness <john.ogness@linutronix.de> + +Add a 
write_atomic() callback to the console. This is an optional +function for console drivers. The function must be atomic (including +NMI safe) for writing to the console. + +Console drivers must still implement the write() callback. The +write_atomic() callback will only be used in special situations, +such as when the kernel panics. + +Creating an NMI safe write_atomic() that must synchronize with +write() requires a careful implementation of the console driver. To +aid with the implementation, a set of console_atomic_*() functions +are provided: + + void console_atomic_lock(unsigned long flags); + void console_atomic_unlock(unsigned long flags); + +These functions synchronize using the printk cpulock and disable +hardware interrupts. + +kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active) +during cpu roundup. This will conflict with the printk cpulock. +Therefore, a CPU must ensure that it is not holding the printk +cpulock when calling kgdb_cpu_enter(). If it is, it must allow its +printk context to complete first. + +A new helper function kgdb_roundup_delay() is introduced for kgdb +to determine if it is holding the printk cpulock. If so, a flag is +set so that when the printk cpulock is released, kgdb will be +re-triggered for that CPU. 
+ +Signed-off-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + +--- + arch/powerpc/include/asm/smp.h | 1 + arch/powerpc/kernel/kgdb.c | 10 ++++++++- + arch/powerpc/kernel/smp.c | 5 ++++ + arch/x86/kernel/kgdb.c | 9 +++++--- + include/linux/console.h | 1 + include/linux/kgdb.h | 3 ++ + include/linux/printk.h | 23 ++++++++++++++++++++ + kernel/debug/debug_core.c | 45 +++++++++++++++++++++++------------------ + kernel/printk/printk.c | 26 +++++++++++++++++++++++ + 9 files changed, 100 insertions(+), 23 deletions(-) +--- +--- a/arch/powerpc/include/asm/smp.h ++++ b/arch/powerpc/include/asm/smp.h +@@ -62,6 +62,7 @@ struct smp_ops_t { + + extern int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us); + extern int smp_send_safe_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us); ++extern void smp_send_debugger_break_cpu(unsigned int cpu); + extern void smp_send_debugger_break(void); + extern void start_secondary_resume(void); + extern void smp_generic_give_timebase(void); +--- a/arch/powerpc/kernel/kgdb.c ++++ b/arch/powerpc/kernel/kgdb.c +@@ -120,11 +120,19 @@ int kgdb_skipexception(int exception, st + + static int kgdb_debugger_ipi(struct pt_regs *regs) + { +- kgdb_nmicallback(raw_smp_processor_id(), regs); ++ int cpu = raw_smp_processor_id(); ++ ++ if (!kgdb_roundup_delay(cpu)) ++ kgdb_nmicallback(cpu, regs); + return 0; + } + + #ifdef CONFIG_SMP ++void kgdb_roundup_cpu(unsigned int cpu) ++{ ++ smp_send_debugger_break_cpu(cpu); ++} ++ + void kgdb_roundup_cpus(void) + { + smp_send_debugger_break(); +--- a/arch/powerpc/kernel/smp.c ++++ b/arch/powerpc/kernel/smp.c +@@ -589,6 +589,11 @@ static void debugger_ipi_callback(struct + debugger_ipi(regs); + } + ++void smp_send_debugger_break_cpu(unsigned int cpu) ++{ ++ smp_send_nmi_ipi(cpu, debugger_ipi_callback, 1000000); ++} ++ + void smp_send_debugger_break(void) + { + smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, debugger_ipi_callback, 1000000); +--- 
a/arch/x86/kernel/kgdb.c ++++ b/arch/x86/kernel/kgdb.c +@@ -502,9 +502,12 @@ static int kgdb_nmi_handler(unsigned int + if (atomic_read(&kgdb_active) != -1) { + /* KGDB CPU roundup */ + cpu = raw_smp_processor_id(); +- kgdb_nmicallback(cpu, regs); +- set_bit(cpu, was_in_debug_nmi); +- touch_nmi_watchdog(); ++ ++ if (!kgdb_roundup_delay(cpu)) { ++ kgdb_nmicallback(cpu, regs); ++ set_bit(cpu, was_in_debug_nmi); ++ touch_nmi_watchdog(); ++ } + + return NMI_HANDLED; + } +--- a/include/linux/console.h ++++ b/include/linux/console.h +@@ -140,6 +140,7 @@ static inline int con_debug_leave(void) + struct console { + char name[16]; + void (*write)(struct console *, const char *, unsigned); ++ void (*write_atomic)(struct console *co, const char *s, unsigned int count); + int (*read)(struct console *, char *, unsigned); + struct tty_driver *(*device)(struct console *, int *); + void (*unblank)(void); +--- a/include/linux/kgdb.h ++++ b/include/linux/kgdb.h +@@ -212,6 +212,8 @@ extern void kgdb_call_nmi_hook(void *ign + */ + extern void kgdb_roundup_cpus(void); + ++extern void kgdb_roundup_cpu(unsigned int cpu); ++ + /** + * kgdb_arch_set_pc - Generic call back to the program counter + * @regs: Current &struct pt_regs. +@@ -365,5 +367,6 @@ extern void kgdb_free_init_mem(void); + #define dbg_late_init() + static inline void kgdb_panic(const char *msg) {} + static inline void kgdb_free_init_mem(void) { } ++static inline void kgdb_roundup_cpu(unsigned int cpu) {} + #endif /* ! 
CONFIG_KGDB */ + #endif /* _KGDB_H_ */ +--- a/include/linux/printk.h ++++ b/include/linux/printk.h +@@ -280,10 +280,18 @@ static inline void dump_stack(void) + extern int __printk_cpu_trylock(void); + extern void __printk_wait_on_cpu_lock(void); + extern void __printk_cpu_unlock(void); ++extern bool kgdb_roundup_delay(unsigned int cpu); ++ + #else ++ + #define __printk_cpu_trylock() 1 + #define __printk_wait_on_cpu_lock() + #define __printk_cpu_unlock() ++ ++static inline bool kgdb_roundup_delay(unsigned int cpu) ++{ ++ return false; ++} + #endif /* CONFIG_SMP */ + + /** +@@ -315,6 +323,21 @@ extern void __printk_cpu_unlock(void); + local_irq_restore(flags); \ + } while (0) + ++/* ++ * Used to synchronize atomic consoles. ++ * ++ * The same as raw_printk_cpu_lock_irqsave() except that hardware interrupts ++ * are _not_ restored while spinning. ++ */ ++#define console_atomic_lock(flags) \ ++ do { \ ++ local_irq_save(flags); \ ++ while (!__printk_cpu_trylock()) \ ++ cpu_relax(); \ ++ } while (0) ++ ++#define console_atomic_unlock raw_printk_cpu_unlock_irqrestore ++ + extern int kptr_restrict; + + /** +--- a/kernel/debug/debug_core.c ++++ b/kernel/debug/debug_core.c +@@ -238,35 +238,42 @@ NOKPROBE_SYMBOL(kgdb_call_nmi_hook); + static DEFINE_PER_CPU(call_single_data_t, kgdb_roundup_csd) = + CSD_INIT(kgdb_call_nmi_hook, NULL); + +-void __weak kgdb_roundup_cpus(void) ++void __weak kgdb_roundup_cpu(unsigned int cpu) + { + call_single_data_t *csd; ++ int ret; ++ ++ csd = &per_cpu(kgdb_roundup_csd, cpu); ++ ++ /* ++ * If it didn't round up last time, don't try again ++ * since smp_call_function_single_async() will block. ++ * ++ * If rounding_up is false then we know that the ++ * previous call must have at least started and that ++ * means smp_call_function_single_async() won't block. 
++ */ ++ if (kgdb_info[cpu].rounding_up) ++ return; ++ kgdb_info[cpu].rounding_up = true; ++ ++ ret = smp_call_function_single_async(cpu, csd); ++ if (ret) ++ kgdb_info[cpu].rounding_up = false; ++} ++NOKPROBE_SYMBOL(kgdb_roundup_cpu); ++ ++void __weak kgdb_roundup_cpus(void) ++{ + int this_cpu = raw_smp_processor_id(); + int cpu; +- int ret; + + for_each_online_cpu(cpu) { + /* No need to roundup ourselves */ + if (cpu == this_cpu) + continue; + +- csd = &per_cpu(kgdb_roundup_csd, cpu); +- +- /* +- * If it didn't round up last time, don't try again +- * since smp_call_function_single_async() will block. +- * +- * If rounding_up is false then we know that the +- * previous call must have at least started and that +- * means smp_call_function_single_async() won't block. +- */ +- if (kgdb_info[cpu].rounding_up) +- continue; +- kgdb_info[cpu].rounding_up = true; +- +- ret = smp_call_function_single_async(cpu, csd); +- if (ret) +- kgdb_info[cpu].rounding_up = false; ++ kgdb_roundup_cpu(cpu); + } + } + NOKPROBE_SYMBOL(kgdb_roundup_cpus); +--- a/kernel/printk/printk.c ++++ b/kernel/printk/printk.c +@@ -44,6 +44,7 @@ + #include <linux/irq_work.h> + #include <linux/ctype.h> + #include <linux/uio.h> ++#include <linux/kgdb.h> + #include <linux/sched/clock.h> + #include <linux/sched/debug.h> + #include <linux/sched/task_stack.h> +@@ -3582,6 +3583,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_rewind); + #ifdef CONFIG_SMP + static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1); + static atomic_t printk_cpulock_nested = ATOMIC_INIT(0); ++static unsigned int kgdb_cpu = -1; + + /** + * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant +@@ -3661,6 +3663,9 @@ EXPORT_SYMBOL(__printk_cpu_trylock); + */ + void __printk_cpu_unlock(void) + { ++ bool trigger_kgdb = false; ++ unsigned int cpu; ++ + if (atomic_read(&printk_cpulock_nested)) { + atomic_dec(&printk_cpulock_nested); + return; +@@ -3671,6 +3676,12 @@ void __printk_cpu_unlock(void) + * LMM(__printk_cpu_unlock:A) + */ + ++ 
cpu = smp_processor_id(); ++ if (kgdb_cpu == cpu) { ++ trigger_kgdb = true; ++ kgdb_cpu = -1; ++ } ++ + /* + * Guarantee loads and stores from this CPU when it was the + * lock owner are visible to the next lock owner. This pairs +@@ -3691,6 +3702,21 @@ void __printk_cpu_unlock(void) + */ + atomic_set_release(&printk_cpulock_owner, + -1); /* LMM(__printk_cpu_unlock:B) */ ++ ++ if (trigger_kgdb) { ++ pr_warn("re-triggering kgdb roundup for CPU#%d\n", cpu); ++ kgdb_roundup_cpu(cpu); ++ } + } + EXPORT_SYMBOL(__printk_cpu_unlock); ++ ++bool kgdb_roundup_delay(unsigned int cpu) ++{ ++ if (cpu != atomic_read(&printk_cpulock_owner)) ++ return false; ++ ++ kgdb_cpu = cpu; ++ return true; ++} ++EXPORT_SYMBOL(kgdb_roundup_delay); + #endif /* CONFIG_SMP */ diff --git a/debian/patches-rt/0248-crypto-cryptd-add-a-lock-instead-preempt_disable-loc.patch b/debian/patches-rt/crypto__cryptd_-_add_a_lock_instead_preempt_disable_local_bh_disable.patch index 5c21662fb..dfac1de2f 100644 --- a/debian/patches-rt/0248-crypto-cryptd-add-a-lock-instead-preempt_disable-loc.patch +++ b/debian/patches-rt/crypto__cryptd_-_add_a_lock_instead_preempt_disable_local_bh_disable.patch @@ -1,9 +1,9 @@ -From 7f0e21866560dd00440e01df861ca86d3ac0e6ec Mon Sep 17 00:00:00 2001 +Subject: crypto: cryptd - add a lock instead preempt_disable/local_bh_disable +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu Jul 26 18:52:00 2018 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 26 Jul 2018 18:52:00 +0200 -Subject: [PATCH 248/296] crypto: cryptd - add a lock instead - preempt_disable/local_bh_disable -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz cryptd has a per-CPU lock which protected with local_bh_disable() and preempt_disable(). @@ -15,15 +15,16 @@ after the cpu_queue has been obtain. 
This is not a problem because the actual resource is protected by the spinlock. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - crypto/cryptd.c | 19 +++++++++---------- + crypto/cryptd.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) - -diff --git a/crypto/cryptd.c b/crypto/cryptd.c -index a1bea0f4baa8..5f8ca8c1f59c 100644 +--- --- a/crypto/cryptd.c +++ b/crypto/cryptd.c -@@ -36,6 +36,7 @@ static struct workqueue_struct *cryptd_wq; +@@ -36,6 +36,7 @@ static struct workqueue_struct *cryptd_w struct cryptd_cpu_queue { struct crypto_queue queue; struct work_struct work; @@ -31,7 +32,7 @@ index a1bea0f4baa8..5f8ca8c1f59c 100644 }; struct cryptd_queue { -@@ -105,6 +106,7 @@ static int cryptd_init_queue(struct cryptd_queue *queue, +@@ -105,6 +106,7 @@ static int cryptd_init_queue(struct cryp cpu_queue = per_cpu_ptr(queue->cpu_queue, cpu); crypto_init_queue(&cpu_queue->queue, max_cpu_qlen); INIT_WORK(&cpu_queue->work, cryptd_queue_worker); @@ -39,7 +40,7 @@ index a1bea0f4baa8..5f8ca8c1f59c 100644 } pr_info("cryptd: max_cpu_qlen set to %d\n", max_cpu_qlen); return 0; -@@ -129,8 +131,10 @@ static int cryptd_enqueue_request(struct cryptd_queue *queue, +@@ -129,8 +131,10 @@ static int cryptd_enqueue_request(struct struct cryptd_cpu_queue *cpu_queue; refcount_t *refcnt; @@ -52,7 +53,7 @@ index a1bea0f4baa8..5f8ca8c1f59c 100644 err = crypto_enqueue_request(&cpu_queue->queue, request); refcnt = crypto_tfm_ctx(request->tfm); -@@ -146,7 +150,7 @@ static int cryptd_enqueue_request(struct cryptd_queue *queue, +@@ -146,7 +150,7 @@ static int cryptd_enqueue_request(struct refcount_inc(refcnt); out_put_cpu: @@ -61,7 +62,7 @@ index a1bea0f4baa8..5f8ca8c1f59c 100644 return err; } -@@ -162,16 +166,11 @@ static void cryptd_queue_worker(struct work_struct *work) +@@ -162,16 +166,11 @@ static void cryptd_queue_worker(struct w cpu_queue = container_of(work, struct cryptd_cpu_queue, 
work); /* * Only handle one request at a time to avoid hogging crypto workqueue. - * preempt_disable/enable is used to prevent being preempted by - * cryptd_enqueue_request(). local_bh_disable/enable is used to prevent - * cryptd_enqueue_request() being accessed from software interrupts. */ - local_bh_disable(); + spin_lock_bh(&cpu_queue->qlock); backlog = crypto_get_backlog(&cpu_queue->queue); req = crypto_dequeue_request(&cpu_queue->queue); - local_bh_enable(); + spin_unlock_bh(&cpu_queue->qlock); if (!req) return; --- -2.30.2 - diff --git a/debian/patches-rt/crypto_testmgr_only_disable_migration_in_crypto_disable_simd_for_test.patch b/debian/patches-rt/crypto_testmgr_only_disable_migration_in_crypto_disable_simd_for_test.patch new file mode 100644 index 000000000..362ebc4b7 --- /dev/null +++ b/debian/patches-rt/crypto_testmgr_only_disable_migration_in_crypto_disable_simd_for_test.patch @@ -0,0 +1,42 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: crypto: testmgr - Only disable migration in crypto_disable_simd_for_test() +Date: Tue, 28 Sep 2021 13:54:01 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +crypto_disable_simd_for_test() disables preemption in order to receive a +stable per-CPU variable which it needs to modify in order to alter +crypto_simd_usable() results. + +This can also be achieved by migrate_disable() which forbids CPU +migrations but allows the task to be preempted. The latter is important +for PREEMPT_RT since operations like skcipher_walk_first() may allocate +memory which must not happen with disabled preemption on PREEMPT_RT. + +Use migrate_disable() in crypto_disable_simd_for_test() to achieve a +stable per-CPU pointer.
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210928115401.441339-1-bigeasy@linutronix.de +--- + crypto/testmgr.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/crypto/testmgr.c ++++ b/crypto/testmgr.c +@@ -1061,14 +1061,14 @@ static void generate_random_testvec_conf + + static void crypto_disable_simd_for_test(void) + { +- preempt_disable(); ++ migrate_disable(); + __this_cpu_write(crypto_simd_disabled_for_test, true); + } + + static void crypto_reenable_simd_for_test(void) + { + __this_cpu_write(crypto_simd_disabled_for_test, false); +- preempt_enable(); ++ migrate_enable(); + } + + /* diff --git a/debian/patches-rt/0288-drivers-block-zram-Replace-bit-spinlocks-with-rtmute.patch b/debian/patches-rt/drivers_block_zram__Replace_bit_spinlocks_with_rtmutex_for_-rt.patch index 5f19d119d..560be2790 100644 --- a/debian/patches-rt/0288-drivers-block-zram-Replace-bit-spinlocks-with-rtmute.patch +++ b/debian/patches-rt/drivers_block_zram__Replace_bit_spinlocks_with_rtmutex_for_-rt.patch @@ -1,25 +1,26 @@ -From 9ae56f02509846f2619ab71abb19cd0704f46f8c Mon Sep 17 00:00:00 2001 +Subject: drivers/block/zram: Replace bit spinlocks with rtmutex for -rt +From: Mike Galbraith <umgwanakikbuti@gmail.com> +Date: Thu Mar 31 04:08:28 2016 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Mike Galbraith <umgwanakikbuti@gmail.com> -Date: Thu, 31 Mar 2016 04:08:28 +0200 -Subject: [PATCH 288/296] drivers/block/zram: Replace bit spinlocks with - rtmutex for -rt -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz They're nondeterministic, and lead to ___might_sleep() splats in -rt. OTOH, they're a lot less wasteful than an rtmutex per page. 
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - drivers/block/zram/zram_drv.c | 36 +++++++++++++++++++++++++++++++++++ - drivers/block/zram/zram_drv.h | 1 + + drivers/block/zram/zram_drv.c | 36 ++++++++++++++++++++++++++++++++++++ + drivers/block/zram/zram_drv.h | 1 + 2 files changed, 37 insertions(+) - -diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c -index 7dce17fd59ba..7eeffb4e95f4 100644 +--- --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c -@@ -59,6 +59,40 @@ static void zram_free_page(struct zram *zram, size_t index); +@@ -59,6 +59,40 @@ static void zram_free_page(struct zram * static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec, u32 index, int offset, struct bio *bio); @@ -60,7 +61,7 @@ index 7dce17fd59ba..7eeffb4e95f4 100644 static int zram_slot_trylock(struct zram *zram, u32 index) { -@@ -74,6 +108,7 @@ static void zram_slot_unlock(struct zram *zram, u32 index) +@@ -74,6 +108,7 @@ static void zram_slot_unlock(struct zram { bit_spin_unlock(ZRAM_LOCK, &zram->table[index].flags); } @@ -68,7 +69,7 @@ index 7dce17fd59ba..7eeffb4e95f4 100644 static inline bool init_done(struct zram *zram) { -@@ -1165,6 +1200,7 @@ static bool zram_meta_alloc(struct zram *zram, u64 disksize) +@@ -1169,6 +1204,7 @@ static bool zram_meta_alloc(struct zram if (!huge_class_size) huge_class_size = zs_huge_class_size(zram->mem_pool); @@ -76,8 +77,6 @@ index 7dce17fd59ba..7eeffb4e95f4 100644 return true; } -diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h -index f2fd46daa760..7e4dd447e1dd 100644 --- a/drivers/block/zram/zram_drv.h +++ b/drivers/block/zram/zram_drv.h @@ -63,6 +63,7 @@ struct zram_table_entry { @@ -88,6 +87,3 @@ index f2fd46daa760..7e4dd447e1dd 100644 #ifdef CONFIG_ZRAM_MEMORY_TRACKING ktime_t ac_time; #endif --- -2.30.2 - diff --git 
a/debian/patches-rt/0196-efi-Allow-efi-runtime.patch b/debian/patches-rt/efi-Allow-efi-runtime.patch index c5a615baf..2b0c0fa78 100644 --- a/debian/patches-rt/0196-efi-Allow-efi-runtime.patch +++ b/debian/patches-rt/efi-Allow-efi-runtime.patch @@ -1,23 +1,24 @@ -From de344d1415d6fb7db4a9e463c643a82533e7c0ea Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Date: Thu, 26 Jul 2018 15:06:10 +0200 -Subject: [PATCH 196/296] efi: Allow efi=runtime -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Subject: [PATCH] efi: Allow efi=runtime +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz In case the command line option "efi=noruntime" is default at built-time, the user could overwrite its state by `efi=runtime' and allow it again. +This is useful on PREEMPT_RT where "efi=noruntime" is default and the +user might need to alter the boot order for instance. + Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lkml.kernel.org/r/20210924134919.1913476-3-bigeasy@linutronix.de --- - drivers/firmware/efi/efi.c | 3 +++ + drivers/firmware/efi/efi.c | 3 +++ 1 file changed, 3 insertions(+) -diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c -index 85496063022d..abb18c958e3b 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c -@@ -97,6 +97,9 @@ static int __init parse_efi_cmdline(char *str) +@@ -97,6 +97,9 @@ static int __init parse_efi_cmdline(char if (parse_option_str(str, "noruntime")) disable_runtime = true; @@ -27,6 +28,3 @@ index 85496063022d..abb18c958e3b 100644 if (parse_option_str(str, "nosoftreserve")) set_bit(EFI_MEM_NO_SOFT_RESERVE, &efi.flags); --- -2.30.2 - diff --git a/debian/patches-rt/0195-efi-Disable-runtime-services-on-RT.patch b/debian/patches-rt/efi-Disable-runtime-services-on-RT.patch index 00218cea9..c4a4099f6 100644 
--- a/debian/patches-rt/0195-efi-Disable-runtime-services-on-RT.patch +++ b/debian/patches-rt/efi-Disable-runtime-services-on-RT.patch @@ -1,35 +1,33 @@ -From 51397f4bbffcf4addaeeab802240ab6e5fba91c8 Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Date: Thu, 26 Jul 2018 15:03:16 +0200 -Subject: [PATCH 195/296] efi: Disable runtime services on RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Subject: [PATCH] efi: Disable runtime services on RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz -Based on meassurements the EFI functions get_variable / +Based on measurements the EFI functions get_variable / get_next_variable take up to 2us which looks okay. -The functions get_time, set_time take around 10ms. Those 10ms are too +The functions get_time, set_time take around 10ms. These 10ms are too much. Even one ms would be too much. Ard mentioned that SetVariable might even trigger larger latencies if -the firware will erase flash blocks on NOR. +the firmware will erase flash blocks on NOR. The time-functions are used by efi-rtc and can be triggered during -runtimed (either via explicit read/write or ntp sync). +run-time (either via explicit read/write or ntp sync). The variable write could be used by pstore. These functions can be disabled without much of a loss. The poweroff / reboot hooks may be provided by PSCI. -Disable EFI's runtime wrappers. +Disable EFI's runtime wrappers on PREEMPT_RT. This was observed on "EFI v2.60 by SoftIron Overdrive 1000". 
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lkml.kernel.org/r/20210924134919.1913476-2-bigeasy@linutronix.de --- - drivers/firmware/efi/efi.c | 2 +- + drivers/firmware/efi/efi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c -index 4b7ee3fa9224..85496063022d 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -66,7 +66,7 @@ struct mm_struct efi_mm = { @@ -41,6 +39,3 @@ index 4b7ee3fa9224..85496063022d 100644 static int __init setup_noefi(char *arg) { disable_runtime = true; --- -2.30.2 - diff --git a/debian/patches-rt/entry--Fix-the-preempt-lazy-fallout.patch b/debian/patches-rt/entry--Fix-the-preempt-lazy-fallout.patch new file mode 100644 index 000000000..f852cc58a --- /dev/null +++ b/debian/patches-rt/entry--Fix-the-preempt-lazy-fallout.patch @@ -0,0 +1,41 @@ +Subject: entry: Fix the preempt lazy fallout +From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue, 13 Jul 2021 07:52:52 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Common code needs common defines.... 
+ +Fixes: f2f9e496208c ("x86: Support for lazy preemption") +Reported-by: kernel test robot <lkp@intel.com> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +--- + arch/x86/include/asm/thread_info.h | 2 -- + include/linux/entry-common.h | 6 ++++++ + 2 files changed, 6 insertions(+), 2 deletions(-) + +--- a/arch/x86/include/asm/thread_info.h ++++ b/arch/x86/include/asm/thread_info.h +@@ -150,8 +150,6 @@ struct thread_info { + + #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW) + +-#define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY) +- + #define STACK_WARN (THREAD_SIZE/8) + + /* +--- a/include/linux/entry-common.h ++++ b/include/linux/entry-common.h +@@ -57,6 +57,12 @@ + # define ARCH_EXIT_TO_USER_MODE_WORK (0) + #endif + ++#ifdef CONFIG_PREEMPT_LAZY ++# define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY) ++#else ++# define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED) ++#endif ++ + #define EXIT_TO_USER_MODE_WORK \ + (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ + _TIF_NEED_RESCHED_MASK | _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \ diff --git a/debian/patches-rt/fs-namespace-Boost-the-mount_lock.lock-owner-instead.patch b/debian/patches-rt/fs-namespace-Boost-the-mount_lock.lock-owner-instead.patch new file mode 100644 index 000000000..10b82429e --- /dev/null +++ b/debian/patches-rt/fs-namespace-Boost-the-mount_lock.lock-owner-instead.patch @@ -0,0 +1,58 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Mon, 25 Oct 2021 16:49:35 +0200 +Subject: [PATCH] fs/namespace: Boost the mount_lock.lock owner instead of + spinning on PREEMPT_RT. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The MNT_WRITE_HOLD flag is used to hold back any new writers while the +mount point is about to be made read-only. __mnt_want_write() then loops +with disabled preemption until this flag disappears. 
Callers of +mnt_hold_writers() (which sets the flag) hold the spinlock_t of +mount_lock (seqlock_t) which disables preemption on !PREEMPT_RT and +ensures the task is not scheduled away so that the spinning side spins +for a long time. + +On PREEMPT_RT the spinlock_t does not disable preemption and so it is +possible that the task setting MNT_WRITE_HOLD is preempted by a task with +higher priority which then spins infinitely waiting for MNT_WRITE_HOLD +to get removed. + +Acquire mount_lock::lock which is held by the setter of MNT_WRITE_HOLD. This +will PI-boost the owner and wait until the lock is dropped, which +means that MNT_WRITE_HOLD is cleared again. + +Link: https://lkml.kernel.org/r/20211025152218.opvcqfku2lhqvp4o@linutronix.de +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + fs/namespace.c | 20 ++++++++++++++++++-- + 1 file changed, 18 insertions(+), 2 deletions(-) + +--- a/fs/namespace.c ++++ b/fs/namespace.c +@@ -343,8 +343,24 @@ int __mnt_want_write(struct vfsmount *m) + * incremented count after it has set MNT_WRITE_HOLD. + */ + smp_mb(); +- while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD) +- cpu_relax(); ++ might_lock(&mount_lock.lock); ++ while (READ_ONCE(mnt->mnt.mnt_flags) & MNT_WRITE_HOLD) { ++ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) { ++ cpu_relax(); ++ } else { ++ /* ++ * This prevents priority inversion, if the task ++ * setting MNT_WRITE_HOLD got preempted on a remote ++ * CPU, and it prevents live lock if the task setting ++ * MNT_WRITE_HOLD has a lower priority and is bound to ++ * the same CPU as the task that is spinning here. ++ */ ++ preempt_enable(); ++ lock_mount_hash(); ++ unlock_mount_hash(); ++ preempt_disable(); ++ } ++ } + /* + * After the slowpath clears MNT_WRITE_HOLD, mnt_is_readonly will + * be set to match its requirements. 
So we must not load that until diff --git a/debian/patches-rt/fs_dcache__disable_preemption_on_i_dir_seqs_write_side.patch b/debian/patches-rt/fs_dcache__disable_preemption_on_i_dir_seqs_write_side.patch new file mode 100644 index 000000000..28007f33f --- /dev/null +++ b/debian/patches-rt/fs_dcache__disable_preemption_on_i_dir_seqs_write_side.patch @@ -0,0 +1,65 @@ +Subject: fs/dcache: disable preemption on i_dir_seq's write side +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri Oct 20 11:29:53 2017 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> + +i_dir_seq is a sequence counter with a lock which is represented by the lowest +bit. The writer atomically updates the counter which ensures that it can be +modified by only one writer at a time. +The commit introducing this change claims that the lock has been integrated +into the counter for space reasons within the inode struct. The i_dir_seq +member is within a union which shares also a pointer. That means by using +seqlock_t we would have a sequence counter and a lock without increasing the +size of the data structure on 64bit and 32bit would grow by 4 bytes. With +lockdep enabled the size would grow and on PREEMPT_RT the spinlock_t is also +larger. + +In order to keep this construct working on PREEMPT_RT, the writer needs to +disable preemption while obtaining the lock on the sequence counter / starting +the write critical section. The writer acquires an otherwise unrelated +spinlock_t which serves the same purpose on !PREEMPT_RT. With enabled +preemption a high priority reader could preempt the writer and live lock the +system while waiting for the locked bit to disappear. + +Another solution would be to have global spinlock_t which is always acquired +by the writer. 
The reader would then acquire the lock if the sequence count is +odd and by doing so force the writer out of the critical section. The global +spinlock_t could be replaced by a hashed lock based on the address of the inode +to lower the lock contention. + +For now, manually disable preemption on PREEMPT_RT to avoid live locks. + +Reported-by: Oleg.Karfich@wago.com +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + fs/dcache.c | 10 +++++++++- + 1 file changed, 9 insertions(+), 1 deletion(-) +--- +--- a/fs/dcache.c ++++ b/fs/dcache.c +@@ -2537,7 +2537,13 @@ EXPORT_SYMBOL(d_rehash); + + static inline unsigned start_dir_add(struct inode *dir) + { +- ++ /* ++ * The caller has a spinlock_t (dentry::d_lock) acquired which disables ++ * preemption on !PREEMPT_RT. On PREEMPT_RT the lock does not disable ++ * preemption and it has to be done explicitly. ++ */ ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_disable(); + for (;;) { + unsigned n = dir->i_dir_seq; + if (!(n & 1) && cmpxchg(&dir->i_dir_seq, n, n + 1) == n) +@@ -2549,6 +2555,8 @@ static inline unsigned start_dir_add(str + static inline void end_dir_add(struct inode *dir, unsigned n) + { + smp_store_release(&dir->i_dir_seq, n + 2); ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) ++ preempt_enable(); + } + + static void d_wait_lookup(struct dentry *dentry) diff --git a/debian/patches-rt/0187-fs-dcache-use-swait_queue-instead-of-waitqueue.patch b/debian/patches-rt/fs_dcache__use_swait_queue_instead_of_waitqueue.patch index af80b4148..151512491 100644 --- a/debian/patches-rt/0187-fs-dcache-use-swait_queue-instead-of-waitqueue.patch +++ b/debian/patches-rt/fs_dcache__use_swait_queue_instead_of_waitqueue.patch @@ -1,33 +1,35 @@ -From b5aba70f6958a2dba76bf013bb8816cf8083468f Mon Sep 17 00:00:00 2001 +Subject: fs/dcache: use swait_queue instead of waitqueue +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Wed Sep 14 14:35:49 2016 +0200 +Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 14 Sep 2016 14:35:49 +0200 -Subject: [PATCH 187/296] fs/dcache: use swait_queue instead of waitqueue -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz __d_lookup_done() invokes wake_up_all() while holding a hlist_bl_lock() which disables preemption. As a workaround convert it to swait. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - fs/afs/dir_silly.c | 2 +- - fs/cifs/readdir.c | 2 +- - fs/dcache.c | 27 +++++++++++++++------------ - fs/fuse/readdir.c | 2 +- - fs/namei.c | 4 ++-- - fs/nfs/dir.c | 4 ++-- - fs/nfs/unlink.c | 4 ++-- - fs/proc/base.c | 3 ++- - fs/proc/proc_sysctl.c | 2 +- - include/linux/dcache.h | 4 ++-- - include/linux/nfs_xdr.h | 2 +- - kernel/sched/swait.c | 1 + + fs/afs/dir_silly.c | 2 +- + fs/cifs/readdir.c | 2 +- + fs/dcache.c | 27 +++++++++++++++------------ + fs/fuse/readdir.c | 2 +- + fs/namei.c | 4 ++-- + fs/nfs/dir.c | 4 ++-- + fs/nfs/unlink.c | 4 ++-- + fs/proc/base.c | 3 ++- + fs/proc/proc_sysctl.c | 2 +- + include/linux/dcache.h | 4 ++-- + include/linux/nfs_xdr.h | 2 +- + kernel/sched/swait.c | 1 + 12 files changed, 31 insertions(+), 26 deletions(-) - -diff --git a/fs/afs/dir_silly.c b/fs/afs/dir_silly.c -index 04f75a44f243..60cbce1995a5 100644 +--- --- a/fs/afs/dir_silly.c +++ b/fs/afs/dir_silly.c -@@ -236,7 +236,7 @@ int afs_silly_iput(struct dentry *dentry, struct inode *inode) +@@ -239,7 +239,7 @@ int afs_silly_iput(struct dentry *dentry struct dentry *alias; int ret; @@ -36,11 +38,9 @@ index 04f75a44f243..60cbce1995a5 100644 _enter("%p{%pd},%llx", dentry, dentry, vnode->fid.vnode); -diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c -index 799be3a5d25e..d5165a7da071 100644 --- a/fs/cifs/readdir.c +++ b/fs/cifs/readdir.c -@@ -81,7 +81,7 @@ 
cifs_prime_dcache(struct dentry *parent, struct qstr *name, +@@ -69,7 +69,7 @@ cifs_prime_dcache(struct dentry *parent, struct inode *inode; struct super_block *sb = parent->d_sb; struct cifs_sb_info *cifs_sb = CIFS_SB(sb); @@ -49,11 +49,9 @@ index 799be3a5d25e..d5165a7da071 100644 cifs_dbg(FYI, "%s: for %s\n", __func__, name->name); -diff --git a/fs/dcache.c b/fs/dcache.c -index ea0485861d93..1f4255ef8722 100644 --- a/fs/dcache.c +++ b/fs/dcache.c -@@ -2518,21 +2518,24 @@ static inline void end_dir_add(struct inode *dir, unsigned n) +@@ -2553,21 +2553,24 @@ static inline void end_dir_add(struct in static void d_wait_lookup(struct dentry *dentry) { @@ -89,7 +87,7 @@ index ea0485861d93..1f4255ef8722 100644 { unsigned int hash = name->hash; struct hlist_bl_head *b = in_lookup_hash(parent, hash); -@@ -2647,7 +2650,7 @@ void __d_lookup_done(struct dentry *dentry) +@@ -2682,7 +2685,7 @@ void __d_lookup_done(struct dentry *dent hlist_bl_lock(b); dentry->d_flags &= ~DCACHE_PAR_LOOKUP; __hlist_bl_del(&dentry->d_u.d_in_lookup_hash); @@ -98,11 +96,9 @@ index ea0485861d93..1f4255ef8722 100644 dentry->d_wait = NULL; hlist_bl_unlock(b); INIT_HLIST_NODE(&dentry->d_u.d_alias); -diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c -index 3441ffa740f3..2fcae5cfd272 100644 --- a/fs/fuse/readdir.c +++ b/fs/fuse/readdir.c -@@ -158,7 +158,7 @@ static int fuse_direntplus_link(struct file *file, +@@ -158,7 +158,7 @@ static int fuse_direntplus_link(struct f struct inode *dir = d_inode(parent); struct fuse_conn *fc; struct inode *inode; @@ -111,11 +107,9 @@ index 3441ffa740f3..2fcae5cfd272 100644 if (!o->nodeid) { /* -diff --git a/fs/namei.c b/fs/namei.c -index 4c9d0c36545d..5a6e15a7eed7 100644 --- a/fs/namei.c +++ b/fs/namei.c -@@ -1520,7 +1520,7 @@ static struct dentry *__lookup_slow(const struct qstr *name, +@@ -1633,7 +1633,7 @@ static struct dentry *__lookup_slow(cons { struct dentry *dentry, *old; struct inode *inode = dir->d_inode; @@ -124,7 +118,7 @@ index 4c9d0c36545d..5a6e15a7eed7 
100644 /* Don't go there if it's already dead */ if (unlikely(IS_DEADDIR(inode))) -@@ -3014,7 +3014,7 @@ static struct dentry *lookup_open(struct nameidata *nd, struct file *file, +@@ -3194,7 +3194,7 @@ static struct dentry *lookup_open(struct struct dentry *dentry; int error, create_error = 0; umode_t mode = op->mode; @@ -133,11 +127,9 @@ index 4c9d0c36545d..5a6e15a7eed7 100644 if (unlikely(IS_DEADDIR(dir_inode))) return ERR_PTR(-ENOENT); -diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c -index c837675cd395..a615b3961d89 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c -@@ -484,7 +484,7 @@ void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry, +@@ -636,7 +636,7 @@ void nfs_prime_dcache(struct dentry *par unsigned long dir_verifier) { struct qstr filename = QSTR_INIT(entry->name, entry->len); @@ -146,7 +138,7 @@ index c837675cd395..a615b3961d89 100644 struct dentry *dentry; struct dentry *alias; struct inode *inode; -@@ -1671,7 +1671,7 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry, +@@ -1876,7 +1876,7 @@ int nfs_atomic_open(struct inode *dir, s struct file *file, unsigned open_flags, umode_t mode) { @@ -155,8 +147,6 @@ index c837675cd395..a615b3961d89 100644 struct nfs_open_context *ctx; struct dentry *res; struct iattr attr = { .ia_valid = ATTR_OPEN }; -diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c -index b27ebdccef70..f86c98a7ed04 100644 --- a/fs/nfs/unlink.c +++ b/fs/nfs/unlink.c @@ -13,7 +13,7 @@ @@ -168,7 +158,7 @@ index b27ebdccef70..f86c98a7ed04 100644 #include <linux/namei.h> #include <linux/fsnotify.h> -@@ -180,7 +180,7 @@ nfs_async_unlink(struct dentry *dentry, const struct qstr *name) +@@ -180,7 +180,7 @@ nfs_async_unlink(struct dentry *dentry, data->cred = get_current_cred(); data->res.dir_attr = &data->dir_attr; @@ -177,19 +167,17 @@ index b27ebdccef70..f86c98a7ed04 100644 status = -EBUSY; spin_lock(&dentry->d_lock); -diff --git a/fs/proc/base.c b/fs/proc/base.c -index 55ce0ee9c5c7..a66f399476fc 100644 --- a/fs/proc/base.c +++ 
b/fs/proc/base.c -@@ -96,6 +96,7 @@ +@@ -95,6 +95,7 @@ #include <linux/posix-timers.h> #include <linux/time_namespace.h> #include <linux/resctrl.h> +#include <linux/swait.h> + #include <linux/cn_proc.h> #include <trace/events/oom.h> #include "internal.h" - #include "fd.h" -@@ -2038,7 +2039,7 @@ bool proc_fill_cache(struct file *file, struct dir_context *ctx, +@@ -2040,7 +2041,7 @@ bool proc_fill_cache(struct file *file, child = d_hash_and_lookup(dir, &qname); if (!child) { @@ -198,11 +186,9 @@ index 55ce0ee9c5c7..a66f399476fc 100644 child = d_alloc_parallel(dir, &qname, &wq); if (IS_ERR(child)) goto end_instantiate; -diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c -index 070d2df8ab9c..a1a964b631d7 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c -@@ -683,7 +683,7 @@ static bool proc_sys_fill_cache(struct file *file, +@@ -678,7 +678,7 @@ static bool proc_sys_fill_cache(struct f child = d_lookup(dir, &qname); if (!child) { @@ -211,11 +197,9 @@ index 070d2df8ab9c..a1a964b631d7 100644 child = d_alloc_parallel(dir, &qname, &wq); if (IS_ERR(child)) return false; -diff --git a/include/linux/dcache.h b/include/linux/dcache.h -index 6f95c3300cbb..c1290db778bd 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h -@@ -106,7 +106,7 @@ struct dentry { +@@ -108,7 +108,7 @@ struct dentry { union { struct list_head d_lru; /* LRU list */ @@ -224,7 +208,7 @@ index 6f95c3300cbb..c1290db778bd 100644 }; struct list_head d_child; /* child of parent list */ struct list_head d_subdirs; /* our children */ -@@ -238,7 +238,7 @@ extern void d_set_d_op(struct dentry *dentry, const struct dentry_operations *op +@@ -240,7 +240,7 @@ extern void d_set_d_op(struct dentry *de extern struct dentry * d_alloc(struct dentry *, const struct qstr *); extern struct dentry * d_alloc_anon(struct super_block *); extern struct dentry * d_alloc_parallel(struct dentry *, const struct qstr *, @@ -233,11 +217,9 @@ index 6f95c3300cbb..c1290db778bd 100644 extern struct dentry * 
d_splice_alias(struct inode *, struct dentry *); extern struct dentry * d_add_ci(struct dentry *, struct inode *, struct qstr *); extern struct dentry * d_exact_alias(struct dentry *, struct inode *); -diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h -index d63cb862d58e..1630690ba709 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h -@@ -1670,7 +1670,7 @@ struct nfs_unlinkdata { +@@ -1692,7 +1692,7 @@ struct nfs_unlinkdata { struct nfs_removeargs args; struct nfs_removeres res; struct dentry *dentry; @@ -246,11 +228,9 @@ index d63cb862d58e..1630690ba709 100644 const struct cred *cred; struct nfs_fattr dir_attr; long timeout; -diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c -index e1c655f928c7..f230b1ac7f91 100644 --- a/kernel/sched/swait.c +++ b/kernel/sched/swait.c -@@ -64,6 +64,7 @@ void swake_up_all(struct swait_queue_head *q) +@@ -64,6 +64,7 @@ void swake_up_all(struct swait_queue_hea struct swait_queue *curr; LIST_HEAD(tmp); @@ -258,6 +238,3 @@ index e1c655f928c7..f230b1ac7f91 100644 raw_spin_lock_irq(&q->lock); list_splice_init(&q->task_list, &tmp); while (!list_empty(&tmp)) { --- -2.30.2 - diff --git a/debian/patches-rt/fscache-Use-only-one-fscache_object_cong_wait.patch b/debian/patches-rt/fscache-Use-only-one-fscache_object_cong_wait.patch new file mode 100644 index 000000000..4f47d8535 --- /dev/null +++ b/debian/patches-rt/fscache-Use-only-one-fscache_object_cong_wait.patch @@ -0,0 +1,123 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 28 Oct 2021 17:30:50 +0200 +Subject: [PATCH] fscache: Use only one fscache_object_cong_wait. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +In the commit mentioned below, fscache was converted from slow-work to +workqueue. slow_work_enqueue() and slow_work_sleep_till_thread_needed() +did not use a per-CPU workqueue. 
They choose from two global waitqueues +depending on the SLOW_WORK_VERY_SLOW bit which was not set so it was always +one waitqueue. + +I can't find out how it is ensured that a waiter on a certain CPU is woken +up by the other side. My guess is that the timeout in schedule_timeout() +ensures that it does not wait forever (or a random wake up). + +fscache_object_sleep_till_congested() must be invoked from preemptible +context in order for schedule() to work. In this case this_cpu_ptr() +should complain with CONFIG_DEBUG_PREEMPT enabled unless the thread is +bound to one CPU. + +wake_up() wakes only one waiter and I'm not sure if it is guaranteed +that only one waiter exists. + +Replace the per-CPU waitqueue with one global waitqueue. + +Fixes: 8b8edefa2fffb ("fscache: convert object to use workqueue instead of slow-work") +Reported-by: Gregor Beck <gregor.beck@gmail.com> +Cc: stable-rt@vger.kernel.org +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lkml.kernel.org/r/20211029083839.xwwt7jgzru3kcpii@linutronix.de +--- + fs/fscache/internal.h | 1 - + fs/fscache/main.c | 6 ------ + fs/fscache/object.c | 13 +++++-------- + 3 files changed, 5 insertions(+), 15 deletions(-) + +--- a/fs/fscache/internal.h ++++ b/fs/fscache/internal.h +@@ -81,7 +81,6 @@ extern unsigned fscache_debug; + extern struct kobject *fscache_root; + extern struct workqueue_struct *fscache_object_wq; + extern struct workqueue_struct *fscache_op_wq; +-DECLARE_PER_CPU(wait_queue_head_t, fscache_object_cong_wait); + + extern unsigned int fscache_hash(unsigned int salt, unsigned int *data, unsigned int n); + +--- a/fs/fscache/main.c ++++ b/fs/fscache/main.c +@@ -41,8 +41,6 @@ struct kobject *fscache_root; + struct workqueue_struct *fscache_object_wq; + struct workqueue_struct *fscache_op_wq; + +-DEFINE_PER_CPU(wait_queue_head_t, fscache_object_cong_wait); +- + /* these values serve as lower bounds, will be adjusted in fscache_init() */ + static unsigned fscache_object_max_active 
= 4; + static unsigned fscache_op_max_active = 2; +@@ -138,7 +136,6 @@ unsigned int fscache_hash(unsigned int s + static int __init fscache_init(void) + { + unsigned int nr_cpus = num_possible_cpus(); +- unsigned int cpu; + int ret; + + fscache_object_max_active = +@@ -161,9 +158,6 @@ static int __init fscache_init(void) + if (!fscache_op_wq) + goto error_op_wq; + +- for_each_possible_cpu(cpu) +- init_waitqueue_head(&per_cpu(fscache_object_cong_wait, cpu)); +- + ret = fscache_proc_init(); + if (ret < 0) + goto error_proc; +--- a/fs/fscache/object.c ++++ b/fs/fscache/object.c +@@ -798,6 +798,8 @@ void fscache_object_destroy(struct fscac + } + EXPORT_SYMBOL(fscache_object_destroy); + ++static DECLARE_WAIT_QUEUE_HEAD(fscache_object_cong_wait); ++ + /* + * enqueue an object for metadata-type processing + */ +@@ -806,16 +808,12 @@ void fscache_enqueue_object(struct fscac + _enter("{OBJ%x}", object->debug_id); + + if (fscache_get_object(object, fscache_obj_get_queue) >= 0) { +- wait_queue_head_t *cong_wq = +- &get_cpu_var(fscache_object_cong_wait); + + if (queue_work(fscache_object_wq, &object->work)) { + if (fscache_object_congested()) +- wake_up(cong_wq); ++ wake_up(&fscache_object_cong_wait); + } else + fscache_put_object(object, fscache_obj_put_queue); +- +- put_cpu_var(fscache_object_cong_wait); + } + } + +@@ -833,16 +831,15 @@ void fscache_enqueue_object(struct fscac + */ + bool fscache_object_sleep_till_congested(signed long *timeoutp) + { +- wait_queue_head_t *cong_wq = this_cpu_ptr(&fscache_object_cong_wait); + DEFINE_WAIT(wait); + + if (fscache_object_congested()) + return true; + +- add_wait_queue_exclusive(cong_wq, &wait); ++ add_wait_queue_exclusive(&fscache_object_cong_wait, &wait); + if (!fscache_object_congested()) + *timeoutp = schedule_timeout(*timeoutp); +- finish_wait(cong_wq, &wait); ++ finish_wait(&fscache_object_cong_wait, &wait); + + return fscache_object_congested(); + } diff --git 
a/debian/patches-rt/generic-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch b/debian/patches-rt/generic-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch new file mode 100644 index 000000000..c591dad36 --- /dev/null +++ b/debian/patches-rt/generic-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch @@ -0,0 +1,30 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Date: Fri, 24 Sep 2021 17:05:48 +0200 +Subject: [PATCH] generic/softirq: Disable softirq stacks on PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +PREEMPT_RT preempts softirqs and the current implementation avoids +do_softirq_own_stack() and only uses __do_softirq(). + +Disable the unused softirq stacks on PREEMPT_RT to save some memory and +ensure that do_softirq_own_stack() is not used, which is not expected. + +[bigeasy: commit description.] + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + include/asm-generic/softirq_stack.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/include/asm-generic/softirq_stack.h ++++ b/include/asm-generic/softirq_stack.h +@@ -2,7 +2,7 @@ + #ifndef __ASM_GENERIC_SOFTIRQ_STACK_H + #define __ASM_GENERIC_SOFTIRQ_STACK_H + +-#ifdef CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK ++#if defined(CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK) && !defined(CONFIG_PREEMPT_RT) + void do_softirq_own_stack(void); + #else + static inline void do_softirq_own_stack(void) diff --git a/debian/patches-rt/genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch b/debian/patches-rt/genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch new file mode 100644 index 000000000..5661b7a2a --- /dev/null +++ b/debian/patches-rt/genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch @@ -0,0 +1,49 @@ +From: Ingo Molnar <mingo@kernel.org> +Date: Fri, 3 Jul 2009 08:29:57 -0500 +Subject: [PATCH] genirq: Disable irqfixup/poll on PREEMPT_RT. 
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The support for misrouted IRQs is used on old / legacy systems and is +not feasible on PREEMPT_RT. + +Polling for interrupts reduces the overall system performance. +Additionally the interrupt latency depends on the polling frequency and +delays are not desired for real time workloads. + +Disable IRQ polling on PREEMPT_RT and let the user know that it is not +enabled. The compiler will optimize the real fixup/poll code out. + +[ bigeasy: Update changelog and switch to IS_ENABLED() ] + +Signed-off-by: Ingo Molnar <mingo@kernel.org> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Link: https://lore.kernel.org/r/20210917223841.c6j6jcaffojrnot3@linutronix.de +--- + kernel/irq/spurious.c | 8 ++++++++ + 1 file changed, 8 insertions(+) + +--- a/kernel/irq/spurious.c ++++ b/kernel/irq/spurious.c +@@ -447,6 +447,10 @@ MODULE_PARM_DESC(noirqdebug, "Disable ir + + static int __init irqfixup_setup(char *str) + { ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) { ++ pr_warn("irqfixup boot option not supported with PREEMPT_RT\n"); ++ return 1; ++ } + irqfixup = 1; + printk(KERN_WARNING "Misrouted IRQ fixup support enabled.\n"); + printk(KERN_WARNING "This may impact system performance.\n"); +@@ -459,6 +463,10 @@ module_param(irqfixup, int, 0644); + + static int __init irqpoll_setup(char *str) + { ++ if (IS_ENABLED(CONFIG_PREEMPT_RT)) { ++ pr_warn("irqpoll boot option not supported with PREEMPT_RT\n"); ++ return 1; ++ } + irqfixup = 2; + printk(KERN_WARNING "Misrouted IRQ fixup and polling support " + "enabled\n"); diff --git a/debian/patches-rt/0070-genirq-Move-prio-assignment-into-the-newly-created-t.patch b/debian/patches-rt/genirq-Move-prio-assignment-into-the-newly-created-t.patch index b1669833a..4e2c06796 100644 --- a/debian/patches-rt/0070-genirq-Move-prio-assignment-into-the-newly-created-t.patch +++ 
b/debian/patches-rt/genirq-Move-prio-assignment-into-the-newly-created-t.patch @@ -1,12 +1,11 @@ -From 340d27d3eefc25d4e4d43159879ef65334d3d9ec Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx@linutronix.de> -Date: Mon, 9 Nov 2020 23:32:39 +0100 -Subject: [PATCH 070/296] genirq: Move prio assignment into the newly created - thread -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Date: Tue, 10 Nov 2020 12:38:48 +0100 +Subject: [PATCH] genirq: Move prio assignment into the newly created thread +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz With enabled threaded interrupts the nouveau driver reported the following: + | Chain exists of: | &mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem | @@ -21,34 +20,33 @@ following: The device->mutex is nvkm_device::mutex. -Unblocking the lockchain at `cpuset_rwsem' is probably the easiest thing -to do. -Move the priority assignment to the start of the newly created thread. +Unblocking the lockchain at `cpuset_rwsem' is probably the easiest +thing to do. Move the priority assignment to the start of the newly +created thread. 
Fixes: 710da3c8ea7df ("sched/core: Prevent race condition between cpuset and __sched_setscheduler()") Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bigeasy: Patch description] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/a23a826af7c108ea5651e73b8fbae5e653f16e86.camel@gmx.de --- - kernel/irq/manage.c | 4 ++-- + kernel/irq/manage.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c -index 79dc02b956dc..0558f75c0b85 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c -@@ -1159,6 +1159,8 @@ static int irq_thread(void *data) +@@ -1259,6 +1259,8 @@ static int irq_thread(void *data) irqreturn_t (*handler_fn)(struct irq_desc *desc, struct irqaction *action); + sched_set_fifo(current); + - if (force_irqthreads && test_bit(IRQTF_FORCED_THREAD, - &action->thread_flags)) + if (force_irqthreads() && test_bit(IRQTF_FORCED_THREAD, + &action->thread_flags)) handler_fn = irq_forced_thread_fn; -@@ -1324,8 +1326,6 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary) +@@ -1424,8 +1426,6 @@ setup_irq_thread(struct irqaction *new, if (IS_ERR(t)) return PTR_ERR(t); @@ -57,6 +55,3 @@ index 79dc02b956dc..0558f75c0b85 100644 /* * We keep the reference to the task struct even if * the thread dies to avoid that the interrupt code --- -2.30.2 - diff --git a/debian/patches-rt/0276-genirq-update-irq_set_irqchip_state-documentation.patch b/debian/patches-rt/genirq__update_irq_set_irqchip_state_documentation.patch index 24bec707d..e32dd65df 100644 --- a/debian/patches-rt/0276-genirq-update-irq_set_irqchip_state-documentation.patch +++ b/debian/patches-rt/genirq__update_irq_set_irqchip_state_documentation.patch @@ -1,8 +1,9 @@ -From 7d892dd2e24d068a6582b36dc7f582a40e961649 Mon Sep 17 00:00:00 2001 +Subject: genirq: update irq_set_irqchip_state 
documentation +From: Josh Cartwright <joshc@ni.com> +Date: Thu Feb 11 11:54:00 2016 -0600 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Josh Cartwright <joshc@ni.com> -Date: Thu, 11 Feb 2016 11:54:00 -0600 -Subject: [PATCH 276/296] genirq: update irq_set_irqchip_state documentation -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz On -rt kernels, the use of migrate_disable()/migrate_enable() is sufficient to guarantee a task isn't moved to another CPU. Update the @@ -10,15 +11,14 @@ irq_set_irqchip_state() documentation to reflect this. Signed-off-by: Josh Cartwright <joshc@ni.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lkml.kernel.org/r/20210917103055.92150-1-bigeasy@linutronix.de --- - kernel/irq/manage.c | 2 +- + kernel/irq/manage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c -index 8857ffa6e0e1..de00a0599afe 100644 +--- --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c -@@ -2721,7 +2721,7 @@ EXPORT_SYMBOL_GPL(irq_get_irqchip_state); +@@ -2833,7 +2833,7 @@ EXPORT_SYMBOL_GPL(irq_get_irqchip_state) * This call sets the internal irqchip state of an interrupt, * depending on the value of @which. * @@ -27,6 +27,3 @@ index 8857ffa6e0e1..de00a0599afe 100644 * interrupt controller has per-cpu registers. 
*/ int irq_set_irqchip_state(unsigned int irq, enum irqchip_irq_state which, --- -2.30.2 - diff --git a/debian/patches-rt/irq_poll-Use-raise_softirq_irqoff-in-cpu_dead-notifi.patch b/debian/patches-rt/irq_poll-Use-raise_softirq_irqoff-in-cpu_dead-notifi.patch new file mode 100644 index 000000000..f6b54a578 --- /dev/null +++ b/debian/patches-rt/irq_poll-Use-raise_softirq_irqoff-in-cpu_dead-notifi.patch @@ -0,0 +1,36 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 2 Apr 2020 21:16:30 +0200 +Subject: [PATCH] irq_poll: Use raise_softirq_irqoff() in cpu_dead notifier +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +__raise_softirq_irqoff() adds a bit to the pending softirq mask and this +is it. The softirq won't be handled in a deterministic way but randomly +when an interrupt fires and handles softirq in its irq_exit() routine or +if something randomly checks and handles pending softirqs in the call +chain before the CPU goes idle. + +Add a local_bh_disable/enable() around the IRQ-off section which will +handle pending softirqs. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lkml.kernel.org/r/20210930103754.2128949-1-bigeasy@linutronix.de +--- + lib/irq_poll.c | 2 ++ + 1 file changed, 2 insertions(+) + +--- a/lib/irq_poll.c ++++ b/lib/irq_poll.c +@@ -191,11 +191,13 @@ static int irq_poll_cpu_dead(unsigned in + * If a CPU goes away, splice its entries to the current CPU + * and trigger a run of the softirq + */ ++ local_bh_disable(); + local_irq_disable(); + list_splice_init(&per_cpu(blk_cpu_iopoll, cpu), + this_cpu_ptr(&blk_cpu_iopoll)); + __raise_softirq_irqoff(IRQ_POLL_SOFTIRQ); + local_irq_enable(); ++ local_bh_enable(); + + return 0; + } diff --git a/debian/patches-rt/0271-jump-label-disable-if-stop_machine-is-used.patch b/debian/patches-rt/jump-label__disable_if_stop_machine_is_used.patch index fd0504e9d..b1ae32470 100644 --- a/debian/patches-rt/0271-jump-label-disable-if-stop_machine-is-used.patch +++ b/debian/patches-rt/jump-label__disable_if_stop_machine_is_used.patch @@ -1,8 +1,9 @@ -From 596b66ebe32017e87a8deb3e68f5a8fc36256bb2 Mon Sep 17 00:00:00 2001 +Subject: jump-label: disable if stop_machine() is used +From: Thomas Gleixner <tglx@linutronix.de> +Date: Wed Jul 8 17:14:48 2015 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Wed, 8 Jul 2015 17:14:48 +0200 -Subject: [PATCH 271/296] jump-label: disable if stop_machine() is used -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Some architectures are using stop_machine() while switching the opcode which leads to latency spikes. 
@@ -20,23 +21,21 @@ The architectures which use other sorcery: Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bigeasy: only ARM for now] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/arm/Kconfig | 2 +- + arch/arm/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig -index 229a806a3dd7..07f43a0e8189 100644 +--- --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig -@@ -66,7 +66,7 @@ config ARM +@@ -68,7 +68,7 @@ config ARM select HARDIRQS_SW_RESEND select HAVE_ARCH_AUDITSYSCALL if AEABI && !OABI_COMPAT select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6 - select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU + select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU && !PREEMPT_RT select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU + select HAVE_ARCH_KASAN if MMU && !XIP_KERNEL select HAVE_ARCH_MMAP_RND_BITS if MMU - select HAVE_ARCH_SECCOMP --- -2.30.2 - diff --git a/debian/patches-rt/kdb__only_use_atomic_consoles_for_output_mirroring.patch b/debian/patches-rt/kdb__only_use_atomic_consoles_for_output_mirroring.patch new file mode 100644 index 000000000..4235b9133 --- /dev/null +++ b/debian/patches-rt/kdb__only_use_atomic_consoles_for_output_mirroring.patch @@ -0,0 +1,54 @@ +Subject: kdb: only use atomic consoles for output mirroring +From: John Ogness <john.ogness@linutronix.de> +Date: Fri Mar 19 14:57:31 2021 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: John Ogness <john.ogness@linutronix.de> + +Currently kdb uses the @oops_in_progress hack to mirror kdb output +to all active consoles from NMI context. Ignoring locks is unsafe. +Now that an NMI-safe atomic interface is available for consoles, +use that interface to mirror kdb output. 
+ +Signed-off-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + + +--- + kernel/debug/kdb/kdb_io.c | 18 ++++++------------ + 1 file changed, 6 insertions(+), 12 deletions(-) +--- +--- a/kernel/debug/kdb/kdb_io.c ++++ b/kernel/debug/kdb/kdb_io.c +@@ -559,23 +559,17 @@ static void kdb_msg_write(const char *ms + cp++; + } + ++ /* mirror output on atomic consoles */ + for_each_console(c) { + if (!(c->flags & CON_ENABLED)) + continue; + if (c == dbg_io_ops->cons) + continue; +- /* +- * Set oops_in_progress to encourage the console drivers to +- * disregard their internal spin locks: in the current calling +- * context the risk of deadlock is a bigger problem than risks +- * due to re-entering the console driver. We operate directly on +- * oops_in_progress rather than using bust_spinlocks() because +- * the calls bust_spinlocks() makes on exit are not appropriate +- * for this calling context. 
+- */ +- ++oops_in_progress; +- c->write(c, msg, msg_len); +- --oops_in_progress; ++ ++ if (!c->write_atomic) ++ continue; ++ c->write_atomic(c, msg, msg_len); ++ + touch_nmi_watchdog(); + } + } diff --git a/debian/patches-rt/0228-kernel-sched-add-put-get-_cpu_light.patch b/debian/patches-rt/kernel_sched__add_putget_cpu_light.patch index e3e66aa97..b2d52feb2 100644 --- a/debian/patches-rt/0228-kernel-sched-add-put-get-_cpu_light.patch +++ b/debian/patches-rt/kernel_sched__add_putget_cpu_light.patch @@ -1,19 +1,21 @@ -From 2c3deb4bdaf70ba1cb1f37e046f6f07983ed2f31 Mon Sep 17 00:00:00 2001 +Subject: kernel/sched: add {put|get}_cpu_light() +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Sat May 27 19:02:06 2017 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Sat, 27 May 2017 19:02:06 +0200 -Subject: [PATCH 228/296] kernel/sched: add {put|get}_cpu_light() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - include/linux/smp.h | 3 +++ + include/linux/smp.h | 3 +++ 1 file changed, 3 insertions(+) - -diff --git a/include/linux/smp.h b/include/linux/smp.h -index 9f13966d3d92..c1f6aaade44a 100644 +--- --- a/include/linux/smp.h +++ b/include/linux/smp.h -@@ -239,6 +239,9 @@ static inline int get_boot_cpu_id(void) +@@ -268,6 +268,9 @@ static inline int get_boot_cpu_id(void) #define get_cpu() ({ preempt_disable(); __smp_processor_id(); }) #define put_cpu() preempt_enable() @@ -23,6 +25,3 @@ index 9f13966d3d92..c1f6aaade44a 100644 /* * Callback to arch code if there's nosmp or maxcpus=0 on the * boot command line: --- -2.30.2 - diff --git a/debian/patches-rt/0069-kthread-Move-prio-affinite-change-into-the-newly-cre.patch 
b/debian/patches-rt/kthread-Move-prio-affinite-change-into-the-newly-cre.patch index da8416ae4..22b674ddc 100644 --- a/debian/patches-rt/0069-kthread-Move-prio-affinite-change-into-the-newly-cre.patch +++ b/debian/patches-rt/kthread-Move-prio-affinite-change-into-the-newly-cre.patch @@ -1,12 +1,12 @@ -From 4edc70d10cbd81132e325f20c8f86e92009e0ab8 Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 9 Nov 2020 21:30:41 +0100 -Subject: [PATCH 069/296] kthread: Move prio/affinite change into the newly - created thread -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Date: Tue, 10 Nov 2020 12:38:47 +0100 +Subject: [PATCH] kthread: Move prio/affinite change into the newly created + thread +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz With enabled threaded interrupts the nouveau driver reported the following: + | Chain exists of: | &mm->mmap_lock#2 --> &device->mutex --> &cpuset_rwsem | @@ -21,23 +21,22 @@ following: The device->mutex is nvkm_device::mutex. -Unblocking the lockchain at `cpuset_rwsem' is probably the easiest thing -to do. -Move the priority reset to the start of the newly created thread. +Unblocking the lockchain at `cpuset_rwsem' is probably the easiest +thing to do. Move the priority reset to the start of the newly +created thread. 
Fixes: 710da3c8ea7df ("sched/core: Prevent race condition between cpuset and __sched_setscheduler()") Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/a23a826af7c108ea5651e73b8fbae5e653f16e86.camel@gmx.de --- - kernel/kthread.c | 16 ++++++++-------- + kernel/kthread.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) -diff --git a/kernel/kthread.c b/kernel/kthread.c -index 5edf7e19ab26..cdfaf64263b3 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c -@@ -243,6 +243,7 @@ EXPORT_SYMBOL_GPL(kthread_parkme); +@@ -270,6 +270,7 @@ EXPORT_SYMBOL_GPL(kthread_parkme); static int kthread(void *_create) { @@ -45,7 +44,7 @@ index 5edf7e19ab26..cdfaf64263b3 100644 /* Copy data: it's on kthread's stack */ struct kthread_create_info *create = _create; int (*threadfn)(void *data) = create->threadfn; -@@ -273,6 +274,13 @@ static int kthread(void *_create) +@@ -300,6 +301,13 @@ static int kthread(void *_create) init_completion(&self->parked); current->vfork_done = &self->exited; @@ -59,7 +58,7 @@ index 5edf7e19ab26..cdfaf64263b3 100644 /* OK, tell user we're spawned, wait for stop or wakeup */ __set_current_state(TASK_UNINTERRUPTIBLE); create->result = current; -@@ -370,7 +378,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data), +@@ -397,7 +405,6 @@ struct task_struct *__kthread_create_on_ } task = create->result; if (!IS_ERR(task)) { @@ -67,7 +66,7 @@ index 5edf7e19ab26..cdfaf64263b3 100644 char name[TASK_COMM_LEN]; /* -@@ -379,13 +386,6 @@ struct task_struct *__kthread_create_on_node(int (*threadfn)(void *data), +@@ -406,13 +413,6 @@ struct task_struct *__kthread_create_on_ */ vsnprintf(name, sizeof(name), namefmt, args); set_task_comm(task, name); @@ -81,6 +80,3 @@ index 5edf7e19ab26..cdfaf64263b3 100644 } kfree(create); return task; --- -2.30.2 - diff --git 
a/debian/patches-rt/leds-trigger-Disable-CPU-trigger-on-PREEMPT_RT.patch b/debian/patches-rt/leds-trigger-Disable-CPU-trigger-on-PREEMPT_RT.patch new file mode 100644 index 000000000..eab2fab46 --- /dev/null +++ b/debian/patches-rt/leds-trigger-Disable-CPU-trigger-on-PREEMPT_RT.patch @@ -0,0 +1,29 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 23 Jan 2014 14:45:59 +0100 +Subject: [PATCH] leds: trigger: Disable CPU trigger on PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The CPU trigger is invoked on ARM from CPU-idle. That trigger later +invokes led_trigger_event() which may invoke the callback of the actual driver. +That driver can acquire a spinlock_t which is okay on a kernel without +PREEMPT_RT. On a PREEMPT_RT enabled kernel this lock is turned into a sleeping +lock and must not be acquired with disabled interrupts. + +Disable the CPU trigger on PREEMPT_RT. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lkml.kernel.org/r/20210924111501.m57cwwn7ahiyxxdd@linutronix.de +--- + drivers/leds/trigger/Kconfig | 1 + + 1 file changed, 1 insertion(+) + +--- a/drivers/leds/trigger/Kconfig ++++ b/drivers/leds/trigger/Kconfig +@@ -64,6 +64,7 @@ config LEDS_TRIGGER_BACKLIGHT + + config LEDS_TRIGGER_CPU + bool "LED CPU Trigger" ++ depends on !PREEMPT_RT + help + This allows LEDs to be controlled by active CPUs. This shows 
This shows + the active CPUs across an array of LEDs so you can see which diff --git a/debian/patches-rt/lockdep-selftests-Avoid-using-local_lock_-acquire-re.patch b/debian/patches-rt/lockdep-selftests-Avoid-using-local_lock_-acquire-re.patch new file mode 100644 index 000000000..85080c3a1 --- /dev/null +++ b/debian/patches-rt/lockdep-selftests-Avoid-using-local_lock_-acquire-re.patch @@ -0,0 +1,116 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri, 13 Aug 2021 18:26:10 +0200 +Subject: [PATCH] lockdep/selftests: Avoid using + local_lock_{acquire|release}(). +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The functions local_lock related functions + local_lock_acquire() + local_lock_release() + +are part of the internal implementation and should be avoided. +Define the lock as DEFINE_PER_CPU so the normal local_lock() function +can be used. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + lib/locking-selftest.c | 30 +++++++++++++++--------------- + 1 file changed, 15 insertions(+), 15 deletions(-) + +--- a/lib/locking-selftest.c ++++ b/lib/locking-selftest.c +@@ -139,7 +139,7 @@ static DEFINE_RT_MUTEX(rtmutex_Z2); + + #endif + +-static local_lock_t local_A = INIT_LOCAL_LOCK(local_A); ++static DEFINE_PER_CPU(local_lock_t, local_A); + + /* + * non-inlined runtime initializers, to let separate locks share +@@ -1320,7 +1320,7 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_ + # define I_MUTEX(x) lockdep_reset_lock(&mutex_##x.dep_map) + # define I_RWSEM(x) lockdep_reset_lock(&rwsem_##x.dep_map) + # define I_WW(x) lockdep_reset_lock(&x.dep_map) +-# define I_LOCAL_LOCK(x) lockdep_reset_lock(&local_##x.dep_map) ++# define I_LOCAL_LOCK(x) lockdep_reset_lock(this_cpu_ptr(&local_##x.dep_map)) + #ifdef CONFIG_RT_MUTEXES + # define I_RTMUTEX(x) lockdep_reset_lock(&rtmutex_##x.dep_map) + #endif +@@ -1380,7 +1380,7 @@ static void reset_locks(void) + init_shared_classes(); + 
raw_spin_lock_init(&raw_lock_A); + raw_spin_lock_init(&raw_lock_B); +- local_lock_init(&local_A); ++ local_lock_init(this_cpu_ptr(&local_A)); + + ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep); ww_mutex_init(&o3, &ww_lockdep); + memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2)); +@@ -2646,8 +2646,8 @@ static void wait_context_tests(void) + + static void local_lock_2(void) + { +- local_lock_acquire(&local_A); /* IRQ-ON */ +- local_lock_release(&local_A); ++ local_lock(&local_A); /* IRQ-ON */ ++ local_unlock(&local_A); + + HARDIRQ_ENTER(); + spin_lock(&lock_A); /* IN-IRQ */ +@@ -2656,18 +2656,18 @@ static void local_lock_2(void) + + HARDIRQ_DISABLE(); + spin_lock(&lock_A); +- local_lock_acquire(&local_A); /* IN-IRQ <-> IRQ-ON cycle, false */ +- local_lock_release(&local_A); ++ local_lock(&local_A); /* IN-IRQ <-> IRQ-ON cycle, false */ ++ local_unlock(&local_A); + spin_unlock(&lock_A); + HARDIRQ_ENABLE(); + } + + static void local_lock_3A(void) + { +- local_lock_acquire(&local_A); /* IRQ-ON */ ++ local_lock(&local_A); /* IRQ-ON */ + spin_lock(&lock_B); /* IRQ-ON */ + spin_unlock(&lock_B); +- local_lock_release(&local_A); ++ local_unlock(&local_A); + + HARDIRQ_ENTER(); + spin_lock(&lock_A); /* IN-IRQ */ +@@ -2676,18 +2676,18 @@ static void local_lock_3A(void) + + HARDIRQ_DISABLE(); + spin_lock(&lock_A); +- local_lock_acquire(&local_A); /* IN-IRQ <-> IRQ-ON cycle only if we count local_lock(), false */ +- local_lock_release(&local_A); ++ local_lock(&local_A); /* IN-IRQ <-> IRQ-ON cycle only if we count local_lock(), false */ ++ local_unlock(&local_A); + spin_unlock(&lock_A); + HARDIRQ_ENABLE(); + } + + static void local_lock_3B(void) + { +- local_lock_acquire(&local_A); /* IRQ-ON */ ++ local_lock(&local_A); /* IRQ-ON */ + spin_lock(&lock_B); /* IRQ-ON */ + spin_unlock(&lock_B); +- local_lock_release(&local_A); ++ local_unlock(&local_A); + + HARDIRQ_ENTER(); + spin_lock(&lock_A); /* IN-IRQ */ +@@ -2696,8 +2696,8 @@ static void local_lock_3B(void) + + 
HARDIRQ_DISABLE(); + spin_lock(&lock_A); +- local_lock_acquire(&local_A); /* IN-IRQ <-> IRQ-ON cycle only if we count local_lock(), false */ +- local_lock_release(&local_A); ++ local_lock(&local_A); /* IN-IRQ <-> IRQ-ON cycle only if we count local_lock(), false */ ++ local_unlock(&local_A); + spin_unlock(&lock_A); + HARDIRQ_ENABLE(); + diff --git a/debian/patches-rt/locking-Allow-to-include-asm-spinlock_types.h-from-l.patch b/debian/patches-rt/locking-Allow-to-include-asm-spinlock_types.h-from-l.patch new file mode 100644 index 000000000..7bf7c3ec2 --- /dev/null +++ b/debian/patches-rt/locking-Allow-to-include-asm-spinlock_types.h-from-l.patch @@ -0,0 +1,266 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue, 17 Aug 2021 09:48:31 +0200 +Subject: [PATCH] locking: Allow to include asm/spinlock_types.h from + linux/spinlock_types_raw.h +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The printk header file includes ratelimit_types.h for its __ratelimit() +based usage. It requires it for the static initializer used in +printk_ratelimited(). It uses a raw_spinlock_t and includes the +spinlock_types.h. It makes no difference on non PREEMPT-RT builds but +PREEMPT-RT replaces the inner part of some locks and therefore includes +rtmutex.h and atomic.h which leads to recursive includes where defines +are missing. +By including only the raw_spinlock_t defines it avoids the atomic.h +related includes at this stage. 
+ +An example on powerpc: + +| CALL scripts/atomic/check-atomics.sh +|In file included from include/linux/bug.h:5, +| from include/linux/page-flags.h:10, +| from kernel/bounds.c:10: +|arch/powerpc/include/asm/page_32.h: In function ‘clear_page’: +|arch/powerpc/include/asm/bug.h:87:4: error: implicit declaration of function ‘__WARN’ [-Werror=implicit-function-declaration] +| 87 | __WARN(); \ +| | ^~~~~~ +|arch/powerpc/include/asm/page_32.h:48:2: note: in expansion of macro ‘WARN_ON’ +| 48 | WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1)); +| | ^~~~~~~ +|arch/powerpc/include/asm/bug.h:58:17: error: invalid application of ‘sizeof’ to incomplete type ‘struct bug_entry’ +| 58 | "i" (sizeof(struct bug_entry)), \ +| | ^~~~~~ +|arch/powerpc/include/asm/bug.h:89:3: note: in expansion of macro ‘BUG_ENTRY’ +| 89 | BUG_ENTRY(PPC_TLNEI " %4, 0", \ +| | ^~~~~~~~~ +|arch/powerpc/include/asm/page_32.h:48:2: note: in expansion of macro ‘WARN_ON’ +| 48 | WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1)); +| | ^~~~~~~ +|In file included from arch/powerpc/include/asm/ptrace.h:298, +| from arch/powerpc/include/asm/hw_irq.h:12, +| from arch/powerpc/include/asm/irqflags.h:12, +| from include/linux/irqflags.h:16, +| from include/asm-generic/cmpxchg-local.h:6, +| from arch/powerpc/include/asm/cmpxchg.h:526, +| from arch/powerpc/include/asm/atomic.h:11, +| from include/linux/atomic.h:7, +| from include/linux/rwbase_rt.h:6, +| from include/linux/rwlock_types.h:55, +| from include/linux/spinlock_types.h:74, +| from include/linux/ratelimit_types.h:7, +| from include/linux/printk.h:10, +| from include/asm-generic/bug.h:22, +| from arch/powerpc/include/asm/bug.h:109, +| from include/linux/bug.h:5, +| from include/linux/page-flags.h:10, +| from kernel/bounds.c:10: +|include/linux/thread_info.h: In function ‘copy_overflow’: +|include/linux/thread_info.h:210:2: error: implicit declaration of function ‘WARN’ [-Werror=implicit-function-declaration] +| 210 | WARN(1, "Buffer overflow detected (%d 
< %lu)!\n", size, count); +| | ^~~~ + +The WARN / BUG include pulls in printk.h and then ptrace.h expects WARN +(from bug.h) which is not yet complete. Even hw_irq.h has WARN_ON() +statements. + +On POWERPC64 there are missing atomic64 defines while building 32bit +VDSO: +| VDSO32C arch/powerpc/kernel/vdso32/vgettimeofday.o +|In file included from include/linux/atomic.h:80, +| from include/linux/rwbase_rt.h:6, +| from include/linux/rwlock_types.h:55, +| from include/linux/spinlock_types.h:74, +| from include/linux/ratelimit_types.h:7, +| from include/linux/printk.h:10, +| from include/linux/kernel.h:19, +| from arch/powerpc/include/asm/page.h:11, +| from arch/powerpc/include/asm/vdso/gettimeofday.h:5, +| from include/vdso/datapage.h:137, +| from lib/vdso/gettimeofday.c:5, +| from <command-line>: +|include/linux/atomic-arch-fallback.h: In function ‘arch_atomic64_inc’: +|include/linux/atomic-arch-fallback.h:1447:2: error: implicit declaration of function ‘arch_atomic64_add’; did you mean ‘arch_atomic_add’? [-Werror=impl +|icit-function-declaration] +| 1447 | arch_atomic64_add(1, v); +| | ^~~~~~~~~~~~~~~~~ +| | arch_atomic_add + +The generic fallback is not included, atomics itself are not used. If +kernel.h does not include printk.h then it comes later from the bug.h +include. 
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + arch/alpha/include/asm/spinlock_types.h | 2 +- + arch/arm/include/asm/spinlock_types.h | 2 +- + arch/arm64/include/asm/spinlock_types.h | 2 +- + arch/csky/include/asm/spinlock_types.h | 2 +- + arch/hexagon/include/asm/spinlock_types.h | 2 +- + arch/ia64/include/asm/spinlock_types.h | 2 +- + arch/powerpc/include/asm/simple_spinlock_types.h | 2 +- + arch/powerpc/include/asm/spinlock_types.h | 2 +- + arch/riscv/include/asm/spinlock_types.h | 2 +- + arch/s390/include/asm/spinlock_types.h | 2 +- + arch/sh/include/asm/spinlock_types.h | 2 +- + arch/xtensa/include/asm/spinlock_types.h | 2 +- + include/linux/ratelimit_types.h | 2 +- + include/linux/spinlock_types_up.h | 2 +- + 14 files changed, 14 insertions(+), 14 deletions(-) + +--- a/arch/alpha/include/asm/spinlock_types.h ++++ b/arch/alpha/include/asm/spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef _ALPHA_SPINLOCK_TYPES_H + #define _ALPHA_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/arm/include/asm/spinlock_types.h ++++ b/arch/arm/include/asm/spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef __ASM_SPINLOCK_TYPES_H + #define __ASM_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/arm64/include/asm/spinlock_types.h ++++ b/arch/arm64/include/asm/spinlock_types.h +@@ -5,7 +5,7 @@ + #ifndef __ASM_SPINLOCK_TYPES_H + #define __ASM_SPINLOCK_TYPES_H + +-#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H) ++#if !defined(__LINUX_SPINLOCK_TYPES_RAW_H) && !defined(__ASM_SPINLOCK_H) + # error "please don't include this file directly" + #endif + +--- a/arch/csky/include/asm/spinlock_types.h ++++ b/arch/csky/include/asm/spinlock_types.h +@@ -3,7 +3,7 @@ + #ifndef __ASM_CSKY_SPINLOCK_TYPES_H + #define 
__ASM_CSKY_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/hexagon/include/asm/spinlock_types.h ++++ b/arch/hexagon/include/asm/spinlock_types.h +@@ -8,7 +8,7 @@ + #ifndef _ASM_SPINLOCK_TYPES_H + #define _ASM_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/ia64/include/asm/spinlock_types.h ++++ b/arch/ia64/include/asm/spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef _ASM_IA64_SPINLOCK_TYPES_H + #define _ASM_IA64_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/powerpc/include/asm/simple_spinlock_types.h ++++ b/arch/powerpc/include/asm/simple_spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef _ASM_POWERPC_SIMPLE_SPINLOCK_TYPES_H + #define _ASM_POWERPC_SIMPLE_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/powerpc/include/asm/spinlock_types.h ++++ b/arch/powerpc/include/asm/spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef _ASM_POWERPC_SPINLOCK_TYPES_H + #define _ASM_POWERPC_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/riscv/include/asm/spinlock_types.h ++++ b/arch/riscv/include/asm/spinlock_types.h +@@ -6,7 +6,7 @@ + #ifndef _ASM_RISCV_SPINLOCK_TYPES_H + #define _ASM_RISCV_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/s390/include/asm/spinlock_types.h ++++ b/arch/s390/include/asm/spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef __ASM_SPINLOCK_TYPES_H + #define 
__ASM_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/sh/include/asm/spinlock_types.h ++++ b/arch/sh/include/asm/spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef __ASM_SH_SPINLOCK_TYPES_H + #define __ASM_SH_SPINLOCK_TYPES_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + +--- a/arch/xtensa/include/asm/spinlock_types.h ++++ b/arch/xtensa/include/asm/spinlock_types.h +@@ -2,7 +2,7 @@ + #ifndef __ASM_SPINLOCK_TYPES_H + #define __ASM_SPINLOCK_TYPES_H + +-#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H) ++#if !defined(__LINUX_SPINLOCK_TYPES_RAW_H) && !defined(__ASM_SPINLOCK_H) + # error "please don't include this file directly" + #endif + +--- a/include/linux/ratelimit_types.h ++++ b/include/linux/ratelimit_types.h +@@ -4,7 +4,7 @@ + + #include <linux/bits.h> + #include <linux/param.h> +-#include <linux/spinlock_types.h> ++#include <linux/spinlock_types_raw.h> + + #define DEFAULT_RATELIMIT_INTERVAL (5 * HZ) + #define DEFAULT_RATELIMIT_BURST 10 +--- a/include/linux/spinlock_types_up.h ++++ b/include/linux/spinlock_types_up.h +@@ -1,7 +1,7 @@ + #ifndef __LINUX_SPINLOCK_TYPES_UP_H + #define __LINUX_SPINLOCK_TYPES_UP_H + +-#ifndef __LINUX_SPINLOCK_TYPES_H ++#ifndef __LINUX_SPINLOCK_TYPES_RAW_H + # error "please don't include this file directly" + #endif + diff --git a/debian/patches-rt/locking-Remove-rt_rwlock_is_contended.patch b/debian/patches-rt/locking-Remove-rt_rwlock_is_contended.patch new file mode 100644 index 000000000..43b227a54 --- /dev/null +++ b/debian/patches-rt/locking-Remove-rt_rwlock_is_contended.patch @@ -0,0 +1,34 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue, 7 Sep 2021 12:11:47 +0200 +Subject: [PATCH] locking: Remove rt_rwlock_is_contended() +Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+rt_rwlock_is_contended() has no users. It makes no sense to use it as
+rwlock_is_contended() because it is a sleeping lock on RT and preemption
+is possible. It always reports != 0 if used by a writer, and even if
+there is a waiter the lock might not be handed over if the
+current owner has the highest priority.
+
+Remove rt_rwlock_is_contended().
+
+Reported-by: kernel test robot <lkp@intel.com>
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+---
+ kernel/locking/spinlock_rt.c | 6 ------
+ 1 file changed, 6 deletions(-)
+
+--- a/kernel/locking/spinlock_rt.c
++++ b/kernel/locking/spinlock_rt.c
+@@ -257,12 +257,6 @@ void __sched rt_write_unlock(rwlock_t *r
+ }
+ EXPORT_SYMBOL(rt_write_unlock);
+
+-int __sched rt_rwlock_is_contended(rwlock_t *rwlock)
+-{
+- return rw_base_is_contended(&rwlock->rwbase);
+-}
+-EXPORT_SYMBOL(rt_rwlock_is_contended);
+-
+ #ifdef CONFIG_DEBUG_LOCK_ALLOC
+ void __rt_rwlock_init(rwlock_t *rwlock, const char *name,
+ struct lock_class_key *key)
diff --git a/debian/patches-rt/0235-md-raid5-Make-raid5_percpu-handling-RT-aware.patch b/debian/patches-rt/md__raid5__Make_raid5_percpu_handling_RT_aware.patch
index 9127a6e63..dbe6fdd11 100644
--- a/debian/patches-rt/0235-md-raid5-Make-raid5_percpu-handling-RT-aware.patch
+++ b/debian/patches-rt/md__raid5__Make_raid5_percpu_handling_RT_aware.patch
@@ -1,8 +1,9 @@
-From fdb7bb7df80b1e3bc77a057a67abe9640fc9b1e3 Mon Sep 17 00:00:00 2001
+Subject: md: raid5: Make raid5_percpu handling RT aware
+From: Thomas Gleixner <tglx@linutronix.de>
+Date: Tue Apr 6 16:51:31 2010 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Thomas Gleixner <tglx@linutronix.de>
-Date: Tue, 6 Apr 2010 16:51:31 +0200
-Subject: [PATCH 235/296] md: raid5: Make raid5_percpu handling RT aware
-Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz __raid_run_ops() disables preemption with get_cpu() around the access to the raid5_percpu variables. That causes scheduling while atomic @@ -14,16 +15,17 @@ preemptible. Reported-by: Udo van den Heuvel <udovdh@xs4all.nl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Udo van den Heuvel <udovdh@xs4all.nl> + + + --- - drivers/md/raid5.c | 7 +++++-- - drivers/md/raid5.h | 1 + + drivers/md/raid5.c | 7 +++++-- + drivers/md/raid5.h | 1 + 2 files changed, 6 insertions(+), 2 deletions(-) - -diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c -index 39343479ac2a..6b53816e71c9 100644 +--- --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c -@@ -2216,8 +2216,9 @@ static void raid_run_ops(struct stripe_head *sh, unsigned long ops_request) +@@ -2217,8 +2217,9 @@ static void raid_run_ops(struct stripe_h struct raid5_percpu *percpu; unsigned long cpu; @@ -34,7 +36,7 @@ index 39343479ac2a..6b53816e71c9 100644 if (test_bit(STRIPE_OP_BIOFILL, &ops_request)) { ops_run_biofill(sh); overlap_clear++; -@@ -2276,7 +2277,8 @@ static void raid_run_ops(struct stripe_head *sh, unsigned long ops_request) +@@ -2277,7 +2278,8 @@ static void raid_run_ops(struct stripe_h if (test_and_clear_bit(R5_Overlap, &dev->flags)) wake_up(&sh->raid_conf->wait_for_overlap); } @@ -44,7 +46,7 @@ index 39343479ac2a..6b53816e71c9 100644 } static void free_stripe(struct kmem_cache *sc, struct stripe_head *sh) -@@ -7098,6 +7100,7 @@ static int raid456_cpu_up_prepare(unsigned int cpu, struct hlist_node *node) +@@ -7102,6 +7104,7 @@ static int raid456_cpu_up_prepare(unsign __func__, cpu); return -ENOMEM; } @@ -52,8 +54,6 @@ index 39343479ac2a..6b53816e71c9 100644 return 0; } -diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h -index 5c05acf20e1f..665fe138ab4f 100644 --- a/drivers/md/raid5.h +++ b/drivers/md/raid5.h @@ -635,6 +635,7 @@ struct r5conf { @@ -64,6 +64,3 @@ index 5c05acf20e1f..665fe138ab4f 100644 
struct page *spare_page; /* Used when checking P/Q in raid6 */
 void *scribble; /* space for constructing buffer
 * lists and performing address
---
-2.30.2
-
diff --git a/debian/patches-rt/mm-Disable-NUMA_BALANCING_DEFAULT_ENABLED-and-TRANSP.patch b/debian/patches-rt/mm-Disable-NUMA_BALANCING_DEFAULT_ENABLED-and-TRANSP.patch
new file mode 100644
index 000000000..699117098
--- /dev/null
+++ b/debian/patches-rt/mm-Disable-NUMA_BALANCING_DEFAULT_ENABLED-and-TRANSP.patch
@@ -0,0 +1,55 @@
+From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Date: Thu, 28 Oct 2021 16:33:27 +0200
+Subject: [PATCH] mm: Disable NUMA_BALANCING_DEFAULT_ENABLED and
+ TRANSPARENT_HUGEPAGE on PREEMPT_RT
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+TRANSPARENT_HUGEPAGE:
+There are potential non-deterministic delays to an RT thread if a critical
+memory region is not THP-aligned and a non-RT buffer is located in the same
+hugepage-aligned region. It's also possible for an unrelated thread to migrate
+pages belonging to an RT task incurring unexpected page faults due to memory
+defragmentation even if khugepaged is disabled.
+
+Regular HUGEPAGEs are not affected by this and can be used.
+
+NUMA_BALANCING:
+There is a non-deterministic delay to mark PTEs PROT_NONE to gather NUMA fault
+samples, increased page faults of regions even if mlocked and non-deterministic
+delays when migrating pages.
+
+[Mel Gorman worded 99% of the commit description].
+ +Link: https://lore.kernel.org/all/20200304091159.GN3818@techsingularity.net/ +Link: https://lore.kernel.org/all/20211026165100.ahz5bkx44lrrw5pt@linutronix.de/ +Cc: Mel Gorman <mgorman@techsingularity.net> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Acked-by: Mel Gorman <mgorman@techsingularity.net> +Link: https://lore.kernel.org/r/20211028143327.hfbxjze7palrpfgp@linutronix.de +--- + init/Kconfig | 2 +- + mm/Kconfig | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +--- a/init/Kconfig ++++ b/init/Kconfig +@@ -901,7 +901,7 @@ config NUMA_BALANCING + bool "Memory placement aware NUMA scheduler" + depends on ARCH_SUPPORTS_NUMA_BALANCING + depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY +- depends on SMP && NUMA && MIGRATION ++ depends on SMP && NUMA && MIGRATION && !PREEMPT_RT + help + This option adds support for automatic NUMA aware memory/task placement. + The mechanism is quite primitive and is based on migrating memory when +--- a/mm/Kconfig ++++ b/mm/Kconfig +@@ -371,7 +371,7 @@ config NOMMU_INITIAL_TRIM_EXCESS + + config TRANSPARENT_HUGEPAGE + bool "Transparent Hugepage Support" +- depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE ++ depends on HAVE_ARCH_TRANSPARENT_HUGEPAGE && !PREEMPT_RT + select COMPACTION + select XARRAY_MULTI + help diff --git a/debian/patches-rt/mm-Disable-zsmalloc-on-PREEMPT_RT.patch b/debian/patches-rt/mm-Disable-zsmalloc-on-PREEMPT_RT.patch new file mode 100644 index 000000000..8b4b653f0 --- /dev/null +++ b/debian/patches-rt/mm-Disable-zsmalloc-on-PREEMPT_RT.patch @@ -0,0 +1,47 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 23 Sep 2021 15:51:48 +0200 +Subject: [PATCH] mm: Disable zsmalloc on PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +For efficiency reasons, zsmalloc is using a slim `handle'. The value is +the address of a memory allocation of 4 or 8 bytes depending on the size +of the long data type. 
The lowest bit in that allocated memory is used +as a bit spin lock. +The usage of the bit spin lock is problematic because with the bit spin +lock held zsmalloc acquires a rwlock_t and spinlock_t which are both +sleeping locks on PREEMPT_RT and therefore must not be acquired with +disabled preemption. + +There is a patch which extends the handle on PREEMPT_RT so that a full +spinlock_t fits (even with lockdep enabled) and then eliminates the bit +spin lock. I'm not sure how sensible zsmalloc on PREEMPT_RT is given +that it is used to store compressed user memory. + +Disable ZSMALLOC on PREEMPT_RT. If there is need for it, we can try to +get it to work. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lkml.kernel.org/r/20210923170121.1860133-1-bigeasy@linutronix.de +--- + mm/Kconfig | 3 ++- + 1 file changed, 2 insertions(+), 1 deletion(-) + +--- a/mm/Kconfig ++++ b/mm/Kconfig +@@ -640,6 +640,7 @@ config ZSWAP_ZPOOL_DEFAULT_Z3FOLD + + config ZSWAP_ZPOOL_DEFAULT_ZSMALLOC + bool "zsmalloc" ++ depends on !PREEMPT_RT + select ZSMALLOC + help + Use the zsmalloc allocator as the default allocator. +@@ -690,7 +691,7 @@ config Z3FOLD + + config ZSMALLOC + tristate "Memory allocator for compressed pages" +- depends on MMU ++ depends on MMU && !PREEMPT_RT + help + zsmalloc is a slab-based memory allocator designed to store + compressed RAM pages. 
zsmalloc uses virtual memory mapping diff --git a/debian/patches-rt/mm-memcontro--Disable-on-PREEMPT_RT.patch b/debian/patches-rt/mm-memcontro--Disable-on-PREEMPT_RT.patch new file mode 100644 index 000000000..c25641819 --- /dev/null +++ b/debian/patches-rt/mm-memcontro--Disable-on-PREEMPT_RT.patch @@ -0,0 +1,26 @@ +Subject: mm/memcontrol: Disable on PREEMPT_RT +From: Thomas Gleixner <tglx@linutronix.de> +Date: Sun, 25 Jul 2021 21:35:46 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +559271146efc ("mm/memcg: optimize user context object stock access") is a +classic example of optimizing for the cpu local BKL serialization without a +clear protection scope. + +Disable MEMCG on RT for now. + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +--- + init/Kconfig | 1 + + 1 file changed, 1 insertion(+) + +--- a/init/Kconfig ++++ b/init/Kconfig +@@ -938,6 +938,7 @@ config PAGE_COUNTER + + config MEMCG + bool "Memory controller" ++ depends on !PREEMPT_RT + select PAGE_COUNTER + select EVENTFD + help diff --git a/debian/patches-rt/0213-mm-zsmalloc-copy-with-get_cpu_var-and-locking.patch b/debian/patches-rt/mm-zsmalloc-Replace-bit-spinlock-and-get_cpu_var-usa.patch index 42f107faf..c3fd635dd 100644 --- a/debian/patches-rt/0213-mm-zsmalloc-copy-with-get_cpu_var-and-locking.patch +++ b/debian/patches-rt/mm-zsmalloc-Replace-bit-spinlock-and-get_cpu_var-usa.patch @@ -1,24 +1,60 @@ -From c88726c43100e69359fc772639774c92086c5d15 Mon Sep 17 00:00:00 2001 From: Mike Galbraith <umgwanakikbuti@gmail.com> -Date: Tue, 22 Mar 2016 11:16:09 +0100 -Subject: [PATCH 213/296] mm/zsmalloc: copy with get_cpu_var() and locking -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz +Date: Tue, 28 Sep 2021 09:38:47 +0200 +Subject: [PATCH] mm/zsmalloc: Replace bit spinlock and get_cpu_var() usage. 
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz

-get_cpu_var() disables preemption and triggers a might_sleep() splat later.
-This is replaced with get_locked_var().
-This bitspinlocks are replaced with a proper mutex which requires a slightly
-larger struct to allocate.
+
+For efficiency reasons, zsmalloc is using a slim `handle'. The value is
+the address of a memory allocation of 4 or 8 bytes depending on the size
+of the long data type. The lowest bit in that allocated memory is used
+as a bit spin lock.
+The usage of the bit spin lock is problematic because with the bit spin
+lock held zsmalloc acquires a rwlock_t and spinlock_t which are both
+sleeping locks on PREEMPT_RT and therefore must not be acquired with
+disabled preemption.
+
+Extend the handle to struct zsmalloc_handle which holds the old handle as
+addr and a spinlock_t which replaces the bit spinlock. Replace all the
+wrapper functions accordingly.
+
+The usage of get_cpu_var() in zs_map_object() is problematic because
+it disables preemption and makes it impossible to acquire any sleeping
+lock on PREEMPT_RT such as a spinlock_t.
+Replace the get_cpu_var() usage with a local_lock_t which is embedded
+in struct mapping_area. It ensures that access to the struct is
+synchronized against all users on the same CPU.
+
+This survived LTP testing.

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
-[bigeasy: replace the bitspin_lock() with a mutex, get_locked_var(). Mike then
-fixed the size magic]
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+[bigeasy: replace the bitspin_lock() with a mutex, get_locked_var() and
+ patch description. Mike then fixed the size magic and made handle lock
+ spinlock_t.]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- - mm/zsmalloc.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++---- - 1 file changed, 79 insertions(+), 6 deletions(-) + mm/Kconfig | 3 -- + mm/zsmalloc.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++----- + 2 files changed, 79 insertions(+), 8 deletions(-) -diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c -index 7a0b79b0a689..277d426c881f 100644 +--- a/mm/Kconfig ++++ b/mm/Kconfig +@@ -640,7 +640,6 @@ config ZSWAP_ZPOOL_DEFAULT_Z3FOLD + + config ZSWAP_ZPOOL_DEFAULT_ZSMALLOC + bool "zsmalloc" +- depends on !PREEMPT_RT + select ZSMALLOC + help + Use the zsmalloc allocator as the default allocator. +@@ -691,7 +690,7 @@ config Z3FOLD + + config ZSMALLOC + tristate "Memory allocator for compressed pages" +- depends on MMU && !PREEMPT_RT ++ depends on MMU + help + zsmalloc is a slab-based memory allocator designed to store + compressed RAM pages. zsmalloc uses virtual memory mapping --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -57,6 +57,7 @@ @@ -37,7 +73,7 @@ index 7a0b79b0a689..277d426c881f 100644 + +struct zsmalloc_handle { + unsigned long addr; -+ struct mutex lock; ++ spinlock_t lock; +}; + +#define ZS_HANDLE_ALLOC_SIZE (sizeof(struct zsmalloc_handle)) @@ -58,7 +94,7 @@ index 7a0b79b0a689..277d426c881f 100644 char *vm_buf; /* copy buffer for objects that span pages */ char *vm_addr; /* address of kmap_atomic()'ed pages */ enum zs_mapmode vm_mm; /* mapping mode */ -@@ -322,7 +338,7 @@ static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage) {} +@@ -322,7 +338,7 @@ static void SetZsPageMovable(struct zs_p static int create_cache(struct zs_pool *pool) { @@ -67,7 +103,7 @@ index 7a0b79b0a689..277d426c881f 100644 0, 0, NULL); if (!pool->handle_cachep) return 1; -@@ -346,9 +362,26 @@ static void destroy_cache(struct zs_pool *pool) +@@ -346,10 +362,27 @@ static void destroy_cache(struct zs_pool static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp) { @@ -81,22 
+117,23 @@ index 7a0b79b0a689..277d426c881f 100644 + if (p) { + struct zsmalloc_handle *zh = p; + -+ mutex_init(&zh->lock); ++ spin_lock_init(&zh->lock); + } +#endif + return (unsigned long)p; -+} -+ + } + +#ifdef CONFIG_PREEMPT_RT +static struct zsmalloc_handle *zs_get_pure_handle(unsigned long handle) +{ -+ return (void *)(handle &~((1 << OBJ_TAG_BITS) - 1)); - } ++ return (void *)(handle & ~((1 << OBJ_TAG_BITS) - 1)); ++} +#endif - ++ static void cache_free_handle(struct zs_pool *pool, unsigned long handle) { -@@ -368,12 +401,18 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage) + kmem_cache_free(pool->handle_cachep, (void *)handle); +@@ -368,12 +401,18 @@ static void cache_free_zspage(struct zs_ static void record_obj(unsigned long handle, unsigned long obj) { @@ -115,19 +152,18 @@ index 7a0b79b0a689..277d426c881f 100644 } /* zpool driver */ -@@ -455,7 +494,10 @@ MODULE_ALIAS("zpool-zsmalloc"); +@@ -455,7 +494,9 @@ MODULE_ALIAS("zpool-zsmalloc"); #endif /* CONFIG_ZPOOL */ /* per-cpu VM mapping areas for zspage accesses that cross page boundaries */ -static DEFINE_PER_CPU(struct mapping_area, zs_map_area); +static DEFINE_PER_CPU(struct mapping_area, zs_map_area) = { -+ /* XXX remove this and use a spin_lock_t in pin_tag() */ + .lock = INIT_LOCAL_LOCK(lock), +}; static bool is_zspage_isolated(struct zspage *zspage) { -@@ -865,7 +907,13 @@ static unsigned long location_to_obj(struct page *page, unsigned int obj_idx) +@@ -862,7 +903,13 @@ static unsigned long location_to_obj(str static unsigned long handle_to_obj(unsigned long handle) { @@ -141,14 +177,14 @@ index 7a0b79b0a689..277d426c881f 100644 } static unsigned long obj_to_head(struct page *page, void *obj) -@@ -879,22 +927,46 @@ static unsigned long obj_to_head(struct page *page, void *obj) +@@ -876,22 +923,46 @@ static unsigned long obj_to_head(struct static inline int testpin_tag(unsigned long handle) { +#ifdef CONFIG_PREEMPT_RT + struct zsmalloc_handle *zh = 
zs_get_pure_handle(handle); + -+ return mutex_is_locked(&zh->lock); ++ return spin_is_locked(&zh->lock); +#else return bit_spin_is_locked(HANDLE_PIN_BIT, (unsigned long *)handle); +#endif @@ -159,7 +195,7 @@ index 7a0b79b0a689..277d426c881f 100644 +#ifdef CONFIG_PREEMPT_RT + struct zsmalloc_handle *zh = zs_get_pure_handle(handle); + -+ return mutex_trylock(&zh->lock); ++ return spin_trylock(&zh->lock); +#else return bit_spin_trylock(HANDLE_PIN_BIT, (unsigned long *)handle); +#endif @@ -170,7 +206,7 @@ index 7a0b79b0a689..277d426c881f 100644 +#ifdef CONFIG_PREEMPT_RT + struct zsmalloc_handle *zh = zs_get_pure_handle(handle); + -+ return mutex_lock(&zh->lock); ++ return spin_lock(&zh->lock); +#else bit_spin_lock(HANDLE_PIN_BIT, (unsigned long *)handle); +#endif @@ -181,14 +217,14 @@ index 7a0b79b0a689..277d426c881f 100644 +#ifdef CONFIG_PREEMPT_RT + struct zsmalloc_handle *zh = zs_get_pure_handle(handle); + -+ return mutex_unlock(&zh->lock); ++ return spin_unlock(&zh->lock); +#else bit_spin_unlock(HANDLE_PIN_BIT, (unsigned long *)handle); +#endif } static void reset_page(struct page *page) -@@ -1278,7 +1350,8 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle, +@@ -1274,7 +1345,8 @@ void *zs_map_object(struct zs_pool *pool class = pool->size_class[class_idx]; off = (class->size * obj_idx) & ~PAGE_MASK; @@ -198,7 +234,7 @@ index 7a0b79b0a689..277d426c881f 100644 area->vm_mm = mm; if (off + class->size <= PAGE_SIZE) { /* this object is contained entirely within a page */ -@@ -1332,7 +1405,7 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle) +@@ -1328,7 +1400,7 @@ void zs_unmap_object(struct zs_pool *poo __zs_unmap_object(area, pages, off, class->size); } @@ -207,6 +243,3 @@ index 7a0b79b0a689..277d426c881f 100644 migrate_read_unlock(zspage); unpin_tag(handle); --- -2.30.2 - diff --git a/debian/patches-rt/0111-mm-workingset-replace-IRQ-off-check-with-a-lockdep-a.patch 
b/debian/patches-rt/mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch index d1294daa3..c30695025 100644 --- a/debian/patches-rt/0111-mm-workingset-replace-IRQ-off-check-with-a-lockdep-a.patch +++ b/debian/patches-rt/mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch @@ -1,9 +1,9 @@ -From a0d59d81859c0672bf1c68069e662bac20fe61c2 Mon Sep 17 00:00:00 2001 +Subject: mm: workingset: replace IRQ-off check with a lockdep assert. +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Mon Feb 11 10:40:46 2019 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 11 Feb 2019 10:40:46 +0100 -Subject: [PATCH 111/296] mm: workingset: replace IRQ-off check with a lockdep - assert. -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Commit @@ -17,15 +17,15 @@ held. Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Link: https://lkml.kernel.org/r/20190211113829.sqf6bdi4c4cdd3rp@linutronix.de --- - mm/workingset.c | 5 ++++- + mm/workingset.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) - -diff --git a/mm/workingset.c b/mm/workingset.c -index 975a4d2dd02e..c3d098c01052 100644 +--- --- a/mm/workingset.c +++ b/mm/workingset.c -@@ -432,6 +432,8 @@ static struct list_lru shadow_nodes; +@@ -433,6 +433,8 @@ static struct list_lru shadow_nodes; void workingset_update_node(struct xa_node *node) { @@ -34,7 +34,7 @@ index 975a4d2dd02e..c3d098c01052 100644 /* * Track non-empty nodes that contain only shadow entries; * unlink those that contain pages or are being freed. -@@ -440,7 +442,8 @@ void workingset_update_node(struct xa_node *node) +@@ -441,7 +443,8 @@ void workingset_update_node(struct xa_no * already where they should be. 
The list_empty() test is safe
 * as node->private_list is protected by the i_pages lock.
 */
@@ -44,6 +44,3 @@ index 975a4d2dd02e..c3d098c01052 100644
 if (node->count && node->count == node->nr_values) {
 if (list_empty(&node->private_list)) {
---
-2.30.2
-
diff --git a/debian/patches-rt/mm_allow_only_slub_on_preempt_rt.patch b/debian/patches-rt/mm_allow_only_slub_on_preempt_rt.patch
new file mode 100644
index 000000000..0b9f4da8c
--- /dev/null
+++ b/debian/patches-rt/mm_allow_only_slub_on_preempt_rt.patch
@@ -0,0 +1,53 @@
+From: Ingo Molnar <mingo@kernel.org>
+Subject: mm: Allow only SLUB on PREEMPT_RT
+Date: Fri, 3 Jul 2009 08:44:03 -0500
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+Memory allocators may disable interrupts or preemption as part of the
+allocation and freeing process. For PREEMPT_RT it is important that
+these sections remain deterministic and short and therefore don't depend
+on the size of the memory to allocate/free or the inner state of the
+algorithm.
+
+Until v3.12-RT the SLAB allocator was an option but involved several
+changes to meet all the requirements. The SLUB design fits better with
+the PREEMPT_RT model and so the SLAB patches were dropped in the 3.12-RT
+patchset. Comparing the two allocators, SLUB outperformed SLAB in both
+throughput (time needed to allocate and free memory) and the maximal
+latency of the system measured with cyclictest during hackbench.
+
+SLOB was never evaluated since it was unlikely that it performs better
+than SLAB. During a quick test, the kernel crashed with SLOB enabled
+during boot.
+
+Disable SLAB and SLOB on PREEMPT_RT.
+
+[bigeasy: commit description.]
+ +Signed-off-by: Ingo Molnar <mingo@kernel.org> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Acked-by: Vlastimil Babka <vbabka@suse.cz> +Link: https://lore.kernel.org/r/20211015210336.gen3tib33ig5q2md@linutronix.de +--- + init/Kconfig | 2 ++ + 1 file changed, 2 insertions(+) + +--- a/init/Kconfig ++++ b/init/Kconfig +@@ -1896,6 +1896,7 @@ choice + + config SLAB + bool "SLAB" ++ depends on !PREEMPT_RT + select HAVE_HARDENED_USERCOPY_ALLOCATOR + help + The regular slab allocator that is established and known to work +@@ -1916,6 +1917,7 @@ config SLUB + config SLOB + depends on EXPERT + bool "SLOB (Simple Allocator)" ++ depends on !PREEMPT_RT + help + SLOB replaces the stock allocator with a drastically simpler + allocator. SLOB is generally more space efficient but diff --git a/debian/patches-rt/mm_page_alloc_use_migrate_disable_in_drain_local_pages_wq.patch b/debian/patches-rt/mm_page_alloc_use_migrate_disable_in_drain_local_pages_wq.patch new file mode 100644 index 000000000..af9c84a12 --- /dev/null +++ b/debian/patches-rt/mm_page_alloc_use_migrate_disable_in_drain_local_pages_wq.patch @@ -0,0 +1,36 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Subject: mm: page_alloc: Use migrate_disable() in drain_local_pages_wq() +Date: Fri, 15 Oct 2021 23:09:33 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +drain_local_pages_wq() disables preemption to avoid CPU migration during +CPU hotplug and can't use cpus_read_lock(). + +Using migrate_disable() works here, too. The scheduler won't take the +CPU offline until the task left the migrate-disable section. +The problem with disabled preemption here is that drain_local_pages() +acquires locks which are turned into sleeping locks on PREEMPT_RT and +can't be acquired with disabled preemption. + +Use migrate_disable() in drain_local_pages_wq(). 
+
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Link: https://lore.kernel.org/r/20211015210933.viw6rjvo64qtqxn4@linutronix.de
+---
+ mm/page_alloc.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+---
+--- a/mm/page_alloc.c
++++ b/mm/page_alloc.c
+@@ -3147,9 +3147,9 @@ static void drain_local_pages_wq(struct
+ * cpu which is alright but we also have to make sure to not move to
+ * a different one.
+ */
+- preempt_disable();
++ migrate_disable();
+ drain_local_pages(drain->zone);
+- preempt_enable();
++ migrate_enable();
+ }
+
+ /*
diff --git a/debian/patches-rt/mm_scatterlist_replace_the_preemptible_warning_in_sg_miter_stop.patch b/debian/patches-rt/mm_scatterlist_replace_the_preemptible_warning_in_sg_miter_stop.patch
new file mode 100644
index 000000000..89edbaafb
--- /dev/null
+++ b/debian/patches-rt/mm_scatterlist_replace_the_preemptible_warning_in_sg_miter_stop.patch
@@ -0,0 +1,86 @@
+From: Thomas Gleixner <tglx@linutronix.de>
+Subject: mm/scatterlist: Replace the !preemptible warning in sg_miter_stop()
+Date: Fri, 15 Oct 2021 23:14:09 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+sg_miter_stop() checks for disabled preemption before unmapping a page
+via kunmap_atomic(). The kernel doc mentions under context that
+preemption must be disabled if SG_MITER_ATOMIC is set.
+
+There is no active requirement for the caller to have preemption
+disabled before invoking sg_miter_stop(). The sg_miter_*()
+implementation itself has no such requirement.
+In fact, preemption is disabled by kmap_atomic() as part of
+sg_miter_next() and remains disabled as long as there is an active
+SG_MITER_ATOMIC mapping. This is a consequence of kmap_atomic() and not
+a requirement for sg_miter_*() itself.
+The user chooses SG_MITER_ATOMIC because it uses the API in a context +where blocking is not possible or blocking is possible but he chooses a +lower weight mapping which is not available on all CPUs and so it might +need less overhead to setup at a price that now preemption will be +disabled. + +The kmap_atomic() implementation on PREEMPT_RT does not disable +preemption. It simply disables CPU migration to ensure that the task +remains on the same CPU while the caller remains preemptible. This in +turn triggers the warning in sg_miter_stop() because preemption is +allowed. + +The PREEMPT_RT and !PREEMPT_RT implementation of kmap_atomic() disable +pagefaults as a requirement. It is sufficient to check for this instead +of disabled preemption. + +Check for disabled pagefault handler in the SG_MITER_ATOMIC case. Remove +the "preemption disabled" part from the kernel doc as the sg_milter*() +implementation does not care. + +[bigeasy: commit description. ] + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211015211409.cqopacv3pxdwn2ty@linutronix.de +--- + lib/scatterlist.c | 11 ++++------- + 1 file changed, 4 insertions(+), 7 deletions(-) + +--- a/lib/scatterlist.c ++++ b/lib/scatterlist.c +@@ -828,8 +828,7 @@ static bool sg_miter_get_next_page(struc + * stops @miter. + * + * Context: +- * Don't care if @miter is stopped, or not proceeded yet. +- * Otherwise, preemption disabled if the SG_MITER_ATOMIC is set. ++ * Don't care. + * + * Returns: + * true if @miter contains the valid mapping. false if end of sg +@@ -865,8 +864,7 @@ EXPORT_SYMBOL(sg_miter_skip); + * @miter->addr and @miter->length point to the current mapping. + * + * Context: +- * Preemption disabled if SG_MITER_ATOMIC. Preemption must stay disabled +- * till @miter is stopped. May sleep if !SG_MITER_ATOMIC. ++ * May sleep if !SG_MITER_ATOMIC. 
+ * + * Returns: + * true if @miter contains the next mapping. false if end of sg +@@ -906,8 +904,7 @@ EXPORT_SYMBOL(sg_miter_next); + * need to be released during iteration. + * + * Context: +- * Preemption disabled if the SG_MITER_ATOMIC is set. Don't care +- * otherwise. ++ * Don't care otherwise. + */ + void sg_miter_stop(struct sg_mapping_iter *miter) + { +@@ -922,7 +919,7 @@ void sg_miter_stop(struct sg_mapping_ite + flush_dcache_page(miter->page); + + if (miter->__flags & SG_MITER_ATOMIC) { +- WARN_ON_ONCE(preemptible()); ++ WARN_ON_ONCE(!pagefault_disabled()); + kunmap_atomic(miter->addr); + } else + kunmap(miter->page); diff --git a/debian/patches-rt/mm_vmalloc__Another_preempt_disable_region_which_sucks.patch b/debian/patches-rt/mm_vmalloc__Another_preempt_disable_region_which_sucks.patch new file mode 100644 index 000000000..968754912 --- /dev/null +++ b/debian/patches-rt/mm_vmalloc__Another_preempt_disable_region_which_sucks.patch @@ -0,0 +1,51 @@ +Subject: mm/vmalloc: Another preempt disable region which sucks +From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue Jul 12 11:39:36 2011 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Thomas Gleixner <tglx@linutronix.de> + +Avoid the preempt disable version of get_cpu_var(). The inner-lock should +provide enough serialisation. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +--- + mm/vmalloc.c | 10 ++++++---- + 1 file changed, 6 insertions(+), 4 deletions(-) +--- +--- a/mm/vmalloc.c ++++ b/mm/vmalloc.c +@@ -1918,11 +1918,12 @@ static void *new_vmap_block(unsigned int + return ERR_PTR(err); + } + +- vbq = &get_cpu_var(vmap_block_queue); ++ get_cpu_light(); ++ vbq = this_cpu_ptr(&vmap_block_queue); + spin_lock(&vbq->lock); + list_add_tail_rcu(&vb->free_list, &vbq->free); + spin_unlock(&vbq->lock); +- put_cpu_var(vmap_block_queue); ++ put_cpu_light(); + + return vaddr; + } +@@ -2001,7 +2002,8 @@ static void *vb_alloc(unsigned long size + order = get_order(size); + + rcu_read_lock(); +- vbq = &get_cpu_var(vmap_block_queue); ++ get_cpu_light(); ++ vbq = this_cpu_ptr(&vmap_block_queue); + list_for_each_entry_rcu(vb, &vbq->free, free_list) { + unsigned long pages_off; + +@@ -2024,7 +2026,7 @@ static void *vb_alloc(unsigned long size + break; + } + +- put_cpu_var(vmap_block_queue); ++ put_cpu_light(); + rcu_read_unlock(); + + /* Allocate new block if nothing was found */ diff --git a/debian/patches-rt/net-core-disable-NET_RX_BUSY_POLL-on-PREEMPT_RT.patch b/debian/patches-rt/net-core-disable-NET_RX_BUSY_POLL-on-PREEMPT_RT.patch new file mode 100644 index 000000000..dba529870 --- /dev/null +++ b/debian/patches-rt/net-core-disable-NET_RX_BUSY_POLL-on-PREEMPT_RT.patch @@ -0,0 +1,37 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri, 1 Oct 2021 16:58:41 +0200 +Subject: [PATCH] net/core: disable NET_RX_BUSY_POLL on PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +napi_busy_loop() disables preemption and performs a NAPI poll. We can't acquire +sleeping locks with disabled preemption which would be required while +__napi_poll() invokes the callback of the driver. + +A threaded interrupt performing the NAPI-poll can be preempted on PREEMPT_RT. 
+A RT thread on another CPU may observe NAPIF_STATE_SCHED bit set and busy-spin +until it is cleared or its spin time runs out. Given it is the task with the +highest priority it will never observe the NEED_RESCHED bit set. +In this case the time is better spent by simply sleeping. + +The NET_RX_BUSY_POLL is disabled by default (the system wide sysctls for +poll/read are set to zero). Disabling NET_RX_BUSY_POLL on PREEMPT_RT to avoid +wrong locking context in case it is used. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211001145841.2308454-1-bigeasy@linutronix.de +Signed-off-by: Jakub Kicinski <kuba@kernel.org> +--- + net/Kconfig | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/net/Kconfig ++++ b/net/Kconfig +@@ -294,7 +294,7 @@ config CGROUP_NET_CLASSID + + config NET_RX_BUSY_POLL + bool +- default y ++ default y if !PREEMPT_RT + + config BQL + bool diff --git a/debian/patches-rt/net-sched-Allow-statistics-reads-from-softirq.patch b/debian/patches-rt/net-sched-Allow-statistics-reads-from-softirq.patch new file mode 100644 index 000000000..7a5666932 --- /dev/null +++ b/debian/patches-rt/net-sched-Allow-statistics-reads-from-softirq.patch @@ -0,0 +1,34 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue, 19 Oct 2021 12:12:04 +0200 +Subject: [PATCH] net: sched: Allow statistics reads from softirq. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Eric reported that the rate estimator reads statics from the softirq +which in turn triggers a warning introduced in the statistics rework. + +The warning is too cautious. The updates happen in the softirq context +so reads from softirq are fine since the writes can not be preempted. +The updates/writes happen during qdisc_run() which ensures one writer +and the softirq context. +The remaining bad context for reading statistics remains in hard-IRQ +because it may preempt a writer. 
+ +Fixes: 29cbcd8582837 ("net: sched: Remove Qdisc::running sequence counter") +Reported-by: Eric Dumazet <eric.dumazet@gmail.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: David S. Miller <davem@davemloft.net> +--- + net/core/gen_stats.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -154,7 +154,7 @@ void gnet_stats_add_basic(struct gnet_st + u64 bytes = 0; + u64 packets = 0; + +- WARN_ON_ONCE((cpu || running) && !in_task()); ++ WARN_ON_ONCE((cpu || running) && in_hardirq()); + + if (cpu) { + gnet_stats_add_basic_cpu(bstats, cpu); diff --git a/debian/patches-rt/net-sched-fix-logic-error-in-qdisc_run_begin.patch b/debian/patches-rt/net-sched-fix-logic-error-in-qdisc_run_begin.patch new file mode 100644 index 000000000..55c4b09d6 --- /dev/null +++ b/debian/patches-rt/net-sched-fix-logic-error-in-qdisc_run_begin.patch @@ -0,0 +1,36 @@ +From: Eric Dumazet <edumazet@google.com> +Date: Mon, 18 Oct 2021 17:34:01 -0700 +Subject: [PATCH] net: sched: fix logic error in qdisc_run_begin() +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +For non TCQ_F_NOLOCK qdisc, qdisc_run_begin() tries to set +__QDISC_STATE_RUNNING and should return true if the bit was not set. + +test_and_set_bit() returns old bit value, therefore we need to invert. + +Fixes: 29cbcd858283 ("net: sched: Remove Qdisc::running sequence counter") +Signed-off-by: Eric Dumazet <edumazet@google.com> +Cc: Ahmed S. 
Darwish <a.darwish@linutronix.de> +Tested-by: Ido Schimmel <idosch@nvidia.com> +Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Tested-by: Toke Høiland-Jørgensen <toke@redhat.com> +Signed-off-by: Jakub Kicinski <kuba@kernel.org> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + include/net/sch_generic.h | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/include/net/sch_generic.h ++++ b/include/net/sch_generic.h +@@ -217,7 +217,7 @@ static inline bool qdisc_run_begin(struc + */ + return spin_trylock(&qdisc->seqlock); + } +- return test_and_set_bit(__QDISC_STATE_RUNNING, &qdisc->state); ++ return !test_and_set_bit(__QDISC_STATE_RUNNING, &qdisc->state); + } + + static inline void qdisc_run_end(struct Qdisc *qdisc) diff --git a/debian/patches-rt/net-sched-gred-dynamically-allocate-tc_gred_qopt_off.patch b/debian/patches-rt/net-sched-gred-dynamically-allocate-tc_gred_qopt_off.patch new file mode 100644 index 000000000..195400b9a --- /dev/null +++ b/debian/patches-rt/net-sched-gred-dynamically-allocate-tc_gred_qopt_off.patch @@ -0,0 +1,127 @@ +From: Arnd Bergmann <arnd@arndb.de> +Date: Tue, 26 Oct 2021 12:07:11 +0200 +Subject: [PATCH] net: sched: gred: dynamically allocate tc_gred_qopt_offload +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +The tc_gred_qopt_offload structure has grown too big to be on the +stack for 32-bit architectures after recent changes. + +net/sched/sch_gred.c:903:13: error: stack frame size (1180) exceeds limit (1024) in 'gred_destroy' [-Werror,-Wframe-larger-than] +net/sched/sch_gred.c:310:13: error: stack frame size (1212) exceeds limit (1024) in 'gred_offload' [-Werror,-Wframe-larger-than] + +Use dynamic allocation per qdisc to avoid this. 
+ +Fixes: 50dc9a8572aa ("net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types") +Fixes: 67c9e6270f30 ("net: sched: Protect Qdisc::bstats with u64_stats") +Suggested-by: Jakub Kicinski <kuba@kernel.org> +Signed-off-by: Arnd Bergmann <arnd@arndb.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211026100711.nalhttf6mbe6sudx@linutronix.de +Signed-off-by: Jakub Kicinski <kuba@kernel.org> +--- + net/sched/sch_gred.c | 50 ++++++++++++++++++++++++++++++-------------------- + 1 file changed, 30 insertions(+), 20 deletions(-) + +--- a/net/sched/sch_gred.c ++++ b/net/sched/sch_gred.c +@@ -56,6 +56,7 @@ struct gred_sched { + u32 DPs; + u32 def; + struct red_vars wred_set; ++ struct tc_gred_qopt_offload *opt; + }; + + static inline int gred_wred_mode(struct gred_sched *table) +@@ -311,42 +312,43 @@ static void gred_offload(struct Qdisc *s + { + struct gred_sched *table = qdisc_priv(sch); + struct net_device *dev = qdisc_dev(sch); +- struct tc_gred_qopt_offload opt = { +- .command = command, +- .handle = sch->handle, +- .parent = sch->parent, +- }; ++ struct tc_gred_qopt_offload *opt = table->opt; + + if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc) + return; + ++ memset(opt, 0, sizeof(*opt)); ++ opt->command = command; ++ opt->handle = sch->handle; ++ opt->parent = sch->parent; ++ + if (command == TC_GRED_REPLACE) { + unsigned int i; + +- opt.set.grio_on = gred_rio_mode(table); +- opt.set.wred_on = gred_wred_mode(table); +- opt.set.dp_cnt = table->DPs; +- opt.set.dp_def = table->def; ++ opt->set.grio_on = gred_rio_mode(table); ++ opt->set.wred_on = gred_wred_mode(table); ++ opt->set.dp_cnt = table->DPs; ++ opt->set.dp_def = table->def; + + for (i = 0; i < table->DPs; i++) { + struct gred_sched_data *q = table->tab[i]; + + if (!q) + continue; +- opt.set.tab[i].present = true; +- opt.set.tab[i].limit = q->limit; +- opt.set.tab[i].prio = q->prio; +- opt.set.tab[i].min = q->parms.qth_min >> 
q->parms.Wlog; +- opt.set.tab[i].max = q->parms.qth_max >> q->parms.Wlog; +- opt.set.tab[i].is_ecn = gred_use_ecn(q); +- opt.set.tab[i].is_harddrop = gred_use_harddrop(q); +- opt.set.tab[i].probability = q->parms.max_P; +- opt.set.tab[i].backlog = &q->backlog; ++ opt->set.tab[i].present = true; ++ opt->set.tab[i].limit = q->limit; ++ opt->set.tab[i].prio = q->prio; ++ opt->set.tab[i].min = q->parms.qth_min >> q->parms.Wlog; ++ opt->set.tab[i].max = q->parms.qth_max >> q->parms.Wlog; ++ opt->set.tab[i].is_ecn = gred_use_ecn(q); ++ opt->set.tab[i].is_harddrop = gred_use_harddrop(q); ++ opt->set.tab[i].probability = q->parms.max_P; ++ opt->set.tab[i].backlog = &q->backlog; + } +- opt.set.qstats = &sch->qstats; ++ opt->set.qstats = &sch->qstats; + } + +- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_GRED, &opt); ++ dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_GRED, opt); + } + + static int gred_offload_dump_stats(struct Qdisc *sch) +@@ -731,6 +733,7 @@ static int gred_change(struct Qdisc *sch + static int gred_init(struct Qdisc *sch, struct nlattr *opt, + struct netlink_ext_ack *extack) + { ++ struct gred_sched *table = qdisc_priv(sch); + struct nlattr *tb[TCA_GRED_MAX + 1]; + int err; + +@@ -754,6 +757,12 @@ static int gred_init(struct Qdisc *sch, + sch->limit = qdisc_dev(sch)->tx_queue_len + * psched_mtu(qdisc_dev(sch)); + ++ if (qdisc_dev(sch)->netdev_ops->ndo_setup_tc) { ++ table->opt = kzalloc(sizeof(*table->opt), GFP_KERNEL); ++ if (!table->opt) ++ return -ENOMEM; ++ } ++ + return gred_change_table_def(sch, tb[TCA_GRED_DPS], extack); + } + +@@ -910,6 +919,7 @@ static void gred_destroy(struct Qdisc *s + gred_destroy_vq(table->tab[i]); + } + gred_offload(sch, TC_GRED_DESTROY); ++ kfree(table->opt); + } + + static struct Qdisc_ops gred_qdisc_ops __read_mostly = { diff --git a/debian/patches-rt/net-sched-remove-one-pair-of-atomic-operations.patch b/debian/patches-rt/net-sched-remove-one-pair-of-atomic-operations.patch new file mode 100644 index 
000000000..d249376e8 --- /dev/null +++ b/debian/patches-rt/net-sched-remove-one-pair-of-atomic-operations.patch @@ -0,0 +1,76 @@ +From: Eric Dumazet <edumazet@google.com> +Date: Mon, 18 Oct 2021 17:34:02 -0700 +Subject: [PATCH] net: sched: remove one pair of atomic operations +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +__QDISC_STATE_RUNNING is only set/cleared from contexts owning qdisc lock. + +Thus we can use less expensive bit operations, as we were doing +before commit f9eb8aea2a1e ("net_sched: transform qdisc running bit into a seqcount") + +Fixes: 29cbcd858283 ("net: sched: Remove Qdisc::running sequence counter") +Signed-off-by: Eric Dumazet <edumazet@google.com> +Cc: Ahmed S. Darwish <a.darwish@linutronix.de> +Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Tested-by: Toke Høiland-Jørgensen <toke@redhat.com> +Signed-off-by: Jakub Kicinski <kuba@kernel.org> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + include/net/sch_generic.h | 12 ++++++++---- + 1 file changed, 8 insertions(+), 4 deletions(-) + +--- a/include/net/sch_generic.h ++++ b/include/net/sch_generic.h +@@ -38,10 +38,13 @@ enum qdisc_state_t { + __QDISC_STATE_DEACTIVATED, + __QDISC_STATE_MISSED, + __QDISC_STATE_DRAINING, ++}; ++ ++enum qdisc_state2_t { + /* Only for !TCQ_F_NOLOCK qdisc. Never access it directly. + * Use qdisc_run_begin/end() or qdisc_is_running() instead. 
+ */ +- __QDISC_STATE_RUNNING, ++ __QDISC_STATE2_RUNNING, + }; + + #define QDISC_STATE_MISSED BIT(__QDISC_STATE_MISSED) +@@ -114,6 +117,7 @@ struct Qdisc { + struct gnet_stats_basic_sync bstats; + struct gnet_stats_queue qstats; + unsigned long state; ++ unsigned long state2; /* must be written under qdisc spinlock */ + struct Qdisc *next_sched; + struct sk_buff_head skb_bad_txq; + +@@ -154,7 +158,7 @@ static inline bool qdisc_is_running(stru + { + if (qdisc->flags & TCQ_F_NOLOCK) + return spin_is_locked(&qdisc->seqlock); +- return test_bit(__QDISC_STATE_RUNNING, &qdisc->state); ++ return test_bit(__QDISC_STATE2_RUNNING, &qdisc->state2); + } + + static inline bool nolock_qdisc_is_empty(const struct Qdisc *qdisc) +@@ -217,7 +221,7 @@ static inline bool qdisc_run_begin(struc + */ + return spin_trylock(&qdisc->seqlock); + } +- return !test_and_set_bit(__QDISC_STATE_RUNNING, &qdisc->state); ++ return !__test_and_set_bit(__QDISC_STATE2_RUNNING, &qdisc->state2); + } + + static inline void qdisc_run_end(struct Qdisc *qdisc) +@@ -229,7 +233,7 @@ static inline void qdisc_run_end(struct + &qdisc->state))) + __netif_schedule(qdisc); + } else { +- clear_bit(__QDISC_STATE_RUNNING, &qdisc->state); ++ __clear_bit(__QDISC_STATE2_RUNNING, &qdisc->state2); + } + } + diff --git a/debian/patches-rt/net-sched-sch_ets-properly-init-all-active-DRR-list-.patch b/debian/patches-rt/net-sched-sch_ets-properly-init-all-active-DRR-list-.patch new file mode 100644 index 000000000..d04bb68f9 --- /dev/null +++ b/debian/patches-rt/net-sched-sch_ets-properly-init-all-active-DRR-list-.patch @@ -0,0 +1,66 @@ +From: Davide Caratti <dcaratti@redhat.com> +Date: Thu, 7 Oct 2021 15:05:02 +0200 +Subject: [PATCH] net/sched: sch_ets: properly init all active DRR list handles +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +leaf classes of ETS qdiscs are served in strict priority or deficit round +robin (DRR), depending on the value of 'nstrict'. 
Since this value can be +changed while traffic is running, we need to be sure that the active list +of DRR classes can be updated at any time, so: + +1) call INIT_LIST_HEAD(&alist) on all leaf classes in .init(), before the + first packet hits any of them. +2) ensure that 'alist' is not overwritten with zeros when a leaf class is + no more strict priority nor DRR (i.e. array elements beyond 'nbands'). + +Link: https://lore.kernel.org/netdev/YS%2FoZ+f0Nr8eQkzH@dcaratti.users.ipa.redhat.com +Suggested-by: Cong Wang <cong.wang@bytedance.com> +Signed-off-by: Davide Caratti <dcaratti@redhat.com> +Signed-off-by: David S. Miller <davem@davemloft.net> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + net/sched/sch_ets.c | 12 +++++++++--- + 1 file changed, 9 insertions(+), 3 deletions(-) + +--- a/net/sched/sch_ets.c ++++ b/net/sched/sch_ets.c +@@ -661,7 +661,6 @@ static int ets_qdisc_change(struct Qdisc + + q->nbands = nbands; + for (i = nstrict; i < q->nstrict; i++) { +- INIT_LIST_HEAD(&q->classes[i].alist); + if (q->classes[i].qdisc->q.qlen) { + list_add_tail(&q->classes[i].alist, &q->active); + q->classes[i].deficit = quanta[i]; +@@ -687,7 +686,11 @@ static int ets_qdisc_change(struct Qdisc + ets_offload_change(sch); + for (i = q->nbands; i < oldbands; i++) { + qdisc_put(q->classes[i].qdisc); +- memset(&q->classes[i], 0, sizeof(q->classes[i])); ++ q->classes[i].qdisc = NULL; ++ q->classes[i].quantum = 0; ++ q->classes[i].deficit = 0; ++ memset(&q->classes[i].bstats, 0, sizeof(q->classes[i].bstats)); ++ memset(&q->classes[i].qstats, 0, sizeof(q->classes[i].qstats)); + } + return 0; + } +@@ -696,7 +699,7 @@ static int ets_qdisc_init(struct Qdisc * + struct netlink_ext_ack *extack) + { + struct ets_sched *q = qdisc_priv(sch); +- int err; ++ int err, i; + + if (!opt) + return -EINVAL; +@@ -706,6 +709,9 @@ static int ets_qdisc_init(struct Qdisc * + return err; + + INIT_LIST_HEAD(&q->active); ++ for (i = 0; i < TCQ_ETS_MAX_BANDS; i++) ++ 
INIT_LIST_HEAD(&q->classes[i].alist); ++ + return ets_qdisc_change(sch, opt, extack); + } + diff --git a/debian/patches-rt/net-stats-Read-the-statistics-in-___gnet_stats_copy_.patch b/debian/patches-rt/net-stats-Read-the-statistics-in-___gnet_stats_copy_.patch new file mode 100644 index 000000000..b43ea9f32 --- /dev/null +++ b/debian/patches-rt/net-stats-Read-the-statistics-in-___gnet_stats_copy_.patch @@ -0,0 +1,90 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 21 Oct 2021 11:59:19 +0200 +Subject: [PATCH] net: stats: Read the statistics in ___gnet_stats_copy_basic() + instead of adding. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Since the rework, the statistics code always adds up the byte and packet +value(s). On 32bit architectures a seqcount_t is used in +gnet_stats_basic_sync to ensure that the 64bit values are not modified +during the read since two 32bit loads are required. The usage of a +seqcount_t requires a lock to ensure that only one writer is active at a +time. This lock leads to disabled preemption during the update. + +The lack of disabling preemption is now creating a warning as reported +by Naresh since the query done by gnet_stats_copy_basic() is in +preemptible context. + +For ___gnet_stats_copy_basic() there is no need to disable preemption +since the update is performed on stack and can't be modified by another +writer. Instead of disabling preemption, to avoid the warning, +simply create a read function to just read the values and return as u64. 
+ +Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> +Fixes: 67c9e6270f301 ("net: sched: Protect Qdisc::bstats with u64_stats") +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20211021095919.bi3szpt3c2kcoiso@linutronix.de +--- + net/core/gen_stats.c | 43 +++++++++++++++++++++++++++++++++++++------ + 1 file changed, 37 insertions(+), 6 deletions(-) + +--- a/net/core/gen_stats.c ++++ b/net/core/gen_stats.c +@@ -171,20 +171,51 @@ void gnet_stats_add_basic(struct gnet_st + } + EXPORT_SYMBOL(gnet_stats_add_basic); + ++static void gnet_stats_read_basic(u64 *ret_bytes, u64 *ret_packets, ++ struct gnet_stats_basic_sync __percpu *cpu, ++ struct gnet_stats_basic_sync *b, bool running) ++{ ++ unsigned int start; ++ ++ if (cpu) { ++ u64 t_bytes = 0, t_packets = 0; ++ int i; ++ ++ for_each_possible_cpu(i) { ++ struct gnet_stats_basic_sync *bcpu = per_cpu_ptr(cpu, i); ++ unsigned int start; ++ u64 bytes, packets; ++ ++ do { ++ start = u64_stats_fetch_begin_irq(&bcpu->syncp); ++ bytes = u64_stats_read(&bcpu->bytes); ++ packets = u64_stats_read(&bcpu->packets); ++ } while (u64_stats_fetch_retry_irq(&bcpu->syncp, start)); ++ ++ t_bytes += bytes; ++ t_packets += packets; ++ } ++ *ret_bytes = t_bytes; ++ *ret_packets = t_packets; ++ return; ++ } ++ do { ++ if (running) ++ start = u64_stats_fetch_begin_irq(&b->syncp); ++ *ret_bytes = u64_stats_read(&b->bytes); ++ *ret_packets = u64_stats_read(&b->packets); ++ } while (running && u64_stats_fetch_retry_irq(&b->syncp, start)); ++} ++ + static int + ___gnet_stats_copy_basic(struct gnet_dump *d, + struct gnet_stats_basic_sync __percpu *cpu, + struct gnet_stats_basic_sync *b, + int type, bool running) + { +- struct gnet_stats_basic_sync bstats; + u64 bstats_bytes, bstats_packets; + +- gnet_stats_basic_sync_init(&bstats); +- gnet_stats_add_basic(&bstats, cpu, b, running); +- +- bstats_bytes = u64_stats_read(&bstats.bytes); +- bstats_packets = u64_stats_read(&bstats.packets); ++ 
gnet_stats_read_basic(&bstats_bytes, &bstats_packets, cpu, b, running); + + if (d->compat_tc_stats && type == TCA_STATS_BASIC) { + d->tc_stats.bytes = bstats_bytes; diff --git a/debian/patches-rt/0242-net-Dequeue-in-dev_cpu_dead-without-the-lock.patch b/debian/patches-rt/net__Dequeue_in_dev_cpu_dead_without_the_lock.patch index 86ecaa61e..f2314d16b 100644 --- a/debian/patches-rt/0242-net-Dequeue-in-dev_cpu_dead-without-the-lock.patch +++ b/debian/patches-rt/net__Dequeue_in_dev_cpu_dead_without_the_lock.patch @@ -1,8 +1,9 @@ -From 985ee984e1772134501f06f7d2b4e9368dd76234 Mon Sep 17 00:00:00 2001 +Subject: net: Dequeue in dev_cpu_dead() without the lock +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Wed Sep 16 16:15:39 2020 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 16 Sep 2020 16:15:39 +0200 -Subject: [PATCH 242/296] net: Dequeue in dev_cpu_dead() without the lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Upstream uses skb_dequeue() to acquire lock of `input_pkt_queue'. The reason is to synchronize against a remote CPU which still thinks that the CPU is online @@ -15,15 +16,16 @@ for `input_pkt_queue' due to the IRQ-off nature of the context. Use the unlocked dequeue version for `input_pkt_queue'. 
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - net/core/dev.c | 2 +- + net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/net/core/dev.c b/net/core/dev.c -index 5809e1bf5287..22f3a4cea216 100644 +--- --- a/net/core/dev.c +++ b/net/core/dev.c -@@ -10923,7 +10923,7 @@ static int dev_cpu_dead(unsigned int oldcpu) +@@ -11310,7 +11310,7 @@ static int dev_cpu_dead(unsigned int old netif_rx_ni(skb); input_queue_head_incr(oldsd); } @@ -32,6 +34,3 @@ index 5809e1bf5287..22f3a4cea216 100644 netif_rx_ni(skb); input_queue_head_incr(oldsd); } --- -2.30.2 - diff --git a/debian/patches-rt/0252-net-Remove-preemption-disabling-in-netif_rx.patch b/debian/patches-rt/net__Remove_preemption_disabling_in_netif_rx.patch index a690f7fe1..e59bea017 100644 --- a/debian/patches-rt/0252-net-Remove-preemption-disabling-in-netif_rx.patch +++ b/debian/patches-rt/net__Remove_preemption_disabling_in_netif_rx.patch @@ -1,8 +1,9 @@ -From 74452f979a73a407faa25f485831d939cc2449da Mon Sep 17 00:00:00 2001 +Subject: net: Remove preemption disabling in netif_rx() +From: Priyanka Jain <Priyanka.Jain@freescale.com> +Date: Thu May 17 09:35:11 2012 +0530 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Priyanka Jain <Priyanka.Jain@freescale.com> -Date: Thu, 17 May 2012 09:35:11 +0530 -Subject: [PATCH 252/296] net: Remove preemption disabling in netif_rx() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz 1)enqueue_to_backlog() (called from netif_rx) should be bind to a particluar CPU. 
This can be achieved by @@ -21,6 +22,7 @@ Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5 put_cpu_light() respectively Signed-off-by: Priyanka Jain <Priyanka.Jain@freescale.com> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rajan Srivastava <Rajan.Srivastava@freescale.com> Cc: <rostedt@goodmis.orgn> Link: http://lkml.kernel.org/r/1337227511-2271-1-git-send-email-Priyanka.Jain@freescale.com @@ -28,15 +30,16 @@ Link: http://lkml.kernel.org/r/1337227511-2271-1-git-send-email-Priyanka.Jain@fr Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [bigeasy: Remove assumption about migrate_disable() from the description.] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - net/core/dev.c | 8 ++++---- + net/core/dev.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) - -diff --git a/net/core/dev.c b/net/core/dev.c -index 15bbb627c90d..73341e6c2e5d 100644 +--- --- a/net/core/dev.c +++ b/net/core/dev.c -@@ -4794,7 +4794,7 @@ static int netif_rx_internal(struct sk_buff *skb) +@@ -4884,7 +4884,7 @@ static int netif_rx_internal(struct sk_b struct rps_dev_flow voidflow, *rflow = &voidflow; int cpu; @@ -45,7 +48,7 @@ index 15bbb627c90d..73341e6c2e5d 100644 rcu_read_lock(); cpu = get_rps_cpu(skb->dev, skb, &rflow); -@@ -4804,14 +4804,14 @@ static int netif_rx_internal(struct sk_buff *skb) +@@ -4894,14 +4894,14 @@ static int netif_rx_internal(struct sk_b ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail); rcu_read_unlock(); @@ -63,6 +66,3 @@ index 15bbb627c90d..73341e6c2e5d 100644 } return ret; } --- -2.30.2 - diff --git a/debian/patches-rt/0241-net-Use-skbufhead-with-raw-lock.patch b/debian/patches-rt/net__Use_skbufhead_with_raw_lock.patch index 83eaa660f..e742a388d 100644 --- a/debian/patches-rt/0241-net-Use-skbufhead-with-raw-lock.patch +++ b/debian/patches-rt/net__Use_skbufhead_with_raw_lock.patch @@ -1,24 +1,25 @@ -From 
014d62067cfcd9a16b58a2f01c371751a0043934 Mon Sep 17 00:00:00 2001 +Subject: net: Use skbufhead with raw lock +From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue Jul 12 15:38:34 2011 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 12 Jul 2011 15:38:34 +0200 -Subject: [PATCH 241/296] net: Use skbufhead with raw lock -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Use the rps lock as rawlock so we can keep irq-off regions. It looks low latency. However we can't kfree() from this context therefore we defer this to the softirq and use the tofree_queue list for it (similar to process_queue). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - include/linux/skbuff.h | 7 +++++++ - net/core/dev.c | 6 +++--- + include/linux/skbuff.h | 7 +++++++ + net/core/dev.c | 6 +++--- 2 files changed, 10 insertions(+), 3 deletions(-) - -diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h -index a828cf99c521..2e4f80cd41df 100644 +--- --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h -@@ -295,6 +295,7 @@ struct sk_buff_head { +@@ -297,6 +297,7 @@ struct sk_buff_head { __u32 qlen; spinlock_t lock; @@ -26,7 +27,7 @@ index a828cf99c521..2e4f80cd41df 100644 }; struct sk_buff; -@@ -1884,6 +1885,12 @@ static inline void skb_queue_head_init(struct sk_buff_head *list) +@@ -1916,6 +1917,12 @@ static inline void skb_queue_head_init(s __skb_queue_head_init(list); } @@ -39,11 +40,9 @@ index a828cf99c521..2e4f80cd41df 100644 static inline void skb_queue_head_init_class(struct sk_buff_head *list, struct lock_class_key *class) { -diff --git a/net/core/dev.c b/net/core/dev.c -index c09d2190fefe..5809e1bf5287 100644 --- a/net/core/dev.c +++ b/net/core/dev.c -@@ -221,14 +221,14 @@ static inline struct hlist_head *dev_index_hash(struct net *net, int ifindex) +@@ -225,14 +225,14 @@ static inline struct hlist_head 
*dev_ind static inline void rps_lock(struct softnet_data *sd) { #ifdef CONFIG_RPS @@ -60,7 +59,7 @@ index c09d2190fefe..5809e1bf5287 100644 #endif } -@@ -11239,7 +11239,7 @@ static int __init net_dev_init(void) +@@ -11626,7 +11626,7 @@ static int __init net_dev_init(void) INIT_WORK(flush, flush_backlog); @@ -69,6 +68,3 @@ index c09d2190fefe..5809e1bf5287 100644 skb_queue_head_init(&sd->process_queue); #ifdef CONFIG_XFRM_OFFLOAD skb_queue_head_init(&sd->xfrm_backlog); --- -2.30.2 - diff --git a/debian/patches-rt/0243-net-dev-always-take-qdisc-s-busylock-in-__dev_xmit_s.patch b/debian/patches-rt/net__dev__always_take_qdiscs_busylock_in___dev_xmit_skb.patch index 5cda0bb52..7b4dd8573 100644 --- a/debian/patches-rt/0243-net-dev-always-take-qdisc-s-busylock-in-__dev_xmit_s.patch +++ b/debian/patches-rt/net__dev__always_take_qdiscs_busylock_in___dev_xmit_skb.patch @@ -1,9 +1,9 @@ -From 737ede0943382be86addc75a3dde34711613653c Mon Sep 17 00:00:00 2001 +Subject: net: dev: always take qdisc's busylock in __dev_xmit_skb() +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Wed Mar 30 13:36:29 2016 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 30 Mar 2016 13:36:29 +0200 -Subject: [PATCH 243/296] net: dev: always take qdisc's busylock in - __dev_xmit_skb() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz The root-lock is dropped before dev_hard_start_xmit() is invoked and after setting the __QDISC___STATE_RUNNING bit. If this task is now pushed away @@ -17,15 +17,16 @@ If we take always the busylock we ensure that the RT task can boost the low-prio task and submit the packet. 
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - net/core/dev.c | 4 ++++ + net/core/dev.c | 4 ++++ 1 file changed, 4 insertions(+) - -diff --git a/net/core/dev.c b/net/core/dev.c -index 22f3a4cea216..15bbb627c90d 100644 +--- --- a/net/core/dev.c +++ b/net/core/dev.c -@@ -3779,7 +3779,11 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q, +@@ -3825,7 +3825,11 @@ static inline int __dev_xmit_skb(struct * This permits qdisc->running owner to get the lock more * often and dequeue packets faster. */ @@ -37,6 +38,3 @@ index 22f3a4cea216..15bbb627c90d 100644 if (unlikely(contended)) spin_lock(&q->busylock); --- -2.30.2 - diff --git a/debian/patches-rt/0224-net-core-use-local_bh_disable-in-netif_rx_ni.patch b/debian/patches-rt/net_core__use_local_bh_disable_in_netif_rx_ni.patch index 4d238f478..be25e87ce 100644 --- a/debian/patches-rt/0224-net-core-use-local_bh_disable-in-netif_rx_ni.patch +++ b/debian/patches-rt/net_core__use_local_bh_disable_in_netif_rx_ni.patch @@ -1,8 +1,9 @@ -From 7e9270d27017d68d3ead462d80846449e18f603e Mon Sep 17 00:00:00 2001 +Subject: net/core: use local_bh_disable() in netif_rx_ni() +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri Jun 16 19:03:16 2017 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 16 Jun 2017 19:03:16 +0200 -Subject: [PATCH 224/296] net/core: use local_bh_disable() in netif_rx_ni() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz In 2004 netif_rx_ni() gained a preempt_disable() section around netif_rx() and its do_softirq() + testing for it. The do_softirq() part @@ -14,15 +15,16 @@ section. The local_bh_enable() part will invoke do_softirq() if required. 
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - net/core/dev.c | 6 ++---- + net/core/dev.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) - -diff --git a/net/core/dev.c b/net/core/dev.c -index 0b88c6154739..c09d2190fefe 100644 +--- --- a/net/core/dev.c +++ b/net/core/dev.c -@@ -4846,11 +4846,9 @@ int netif_rx_ni(struct sk_buff *skb) +@@ -4943,11 +4943,9 @@ int netif_rx_ni(struct sk_buff *skb) trace_netif_rx_ni_entry(skb); @@ -36,6 +38,3 @@ index 0b88c6154739..c09d2190fefe 100644 trace_netif_rx_ni_exit(err); return err; --- -2.30.2 - diff --git a/debian/patches-rt/0249-panic-skip-get_random_bytes-for-RT_FULL-in-init_oops.patch b/debian/patches-rt/panic__skip_get_random_bytes_for_RT_FULL_in_init_oops_id.patch index d9aa9e3b9..ba587779f 100644 --- a/debian/patches-rt/0249-panic-skip-get_random_bytes-for-RT_FULL-in-init_oops.patch +++ b/debian/patches-rt/panic__skip_get_random_bytes_for_RT_FULL_in_init_oops_id.patch @@ -1,23 +1,23 @@ -From e3babe886508ceb48a600aced5418369dd4d9390 Mon Sep 17 00:00:00 2001 +Subject: panic: skip get_random_bytes for RT_FULL in init_oops_id +From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue Jul 14 14:26:34 2015 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 14 Jul 2015 14:26:34 +0200 -Subject: [PATCH 249/296] panic: skip get_random_bytes for RT_FULL in - init_oops_id -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Disable on -RT. If this is invoked from irq-context we will have problems to acquire the sleeping lock. 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - kernel/panic.c | 2 ++ + kernel/panic.c | 2 ++ 1 file changed, 2 insertions(+) - -diff --git a/kernel/panic.c b/kernel/panic.c -index 0efdac3cf94e..a14e2f5a9f55 100644 +--- --- a/kernel/panic.c +++ b/kernel/panic.c -@@ -544,9 +544,11 @@ static u64 oops_id; +@@ -545,9 +545,11 @@ static u64 oops_id; static int init_oops_id(void) { @@ -29,6 +29,3 @@ index 0efdac3cf94e..a14e2f5a9f55 100644 oops_id++; return 0; --- -2.30.2 - diff --git a/debian/patches-rt/powerpc__Add_support_for_lazy_preemption.patch b/debian/patches-rt/powerpc__Add_support_for_lazy_preemption.patch new file mode 100644 index 000000000..edc7fdc51 --- /dev/null +++ b/debian/patches-rt/powerpc__Add_support_for_lazy_preemption.patch @@ -0,0 +1,104 @@ +Subject: powerpc: Add support for lazy preemption +From: Thomas Gleixner <tglx@linutronix.de> +Date: Thu Nov 1 10:14:11 2012 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Thomas Gleixner <tglx@linutronix.de> + +Implement the powerpc pieces for lazy preempt. 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + +--- + arch/powerpc/Kconfig | 1 + + arch/powerpc/include/asm/thread_info.h | 7 +++++++ + arch/powerpc/kernel/interrupt.c | 8 ++++++-- + 3 files changed, 14 insertions(+), 2 deletions(-) +--- +--- a/arch/powerpc/Kconfig ++++ b/arch/powerpc/Kconfig +@@ -235,6 +235,7 @@ config PPC + select HAVE_PERF_EVENTS_NMI if PPC64 + select HAVE_PERF_REGS + select HAVE_PERF_USER_STACK_DUMP ++ select HAVE_PREEMPT_LAZY + select HAVE_REGS_AND_STACK_ACCESS_API + select HAVE_RELIABLE_STACKTRACE + select HAVE_RSEQ +--- a/arch/powerpc/include/asm/thread_info.h ++++ b/arch/powerpc/include/asm/thread_info.h +@@ -47,6 +47,8 @@ + struct thread_info { + int preempt_count; /* 0 => preemptable, + <0 => BUG */ ++ int preempt_lazy_count; /* 0 => preemptable, ++ <0 => BUG */ + unsigned long local_flags; /* private flags for thread */ + #ifdef CONFIG_LIVEPATCH + unsigned long *livepatch_sp; +@@ -93,6 +95,7 @@ void arch_setup_new_exec(void); + #define TIF_PATCH_PENDING 6 /* pending live patching update */ + #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */ + #define TIF_SINGLESTEP 8 /* singlestepping active */ ++#define TIF_NEED_RESCHED_LAZY 9 /* lazy rescheduling necessary */ + #define TIF_SECCOMP 10 /* secure computing */ + #define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */ + #define TIF_NOERROR 12 /* Force successful syscall return */ +@@ -108,6 +111,7 @@ void arch_setup_new_exec(void); + #define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is polling TIF_NEED_RESCHED */ + #define TIF_32BIT 20 /* 32 bit binary */ + ++ + /* as above, but as bit values */ + #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) + #define _TIF_SIGPENDING (1<<TIF_SIGPENDING) +@@ -119,6 +123,7 @@ void arch_setup_new_exec(void); + #define _TIF_PATCH_PENDING (1<<TIF_PATCH_PENDING) + #define _TIF_SYSCALL_AUDIT (1<<TIF_SYSCALL_AUDIT) + #define _TIF_SINGLESTEP (1<<TIF_SINGLESTEP) ++#define _TIF_NEED_RESCHED_LAZY (1<<TIF_NEED_RESCHED_LAZY) + 
#define _TIF_SECCOMP (1<<TIF_SECCOMP) + #define _TIF_RESTOREALL (1<<TIF_RESTOREALL) + #define _TIF_NOERROR (1<<TIF_NOERROR) +@@ -132,10 +137,12 @@ void arch_setup_new_exec(void); + _TIF_SYSCALL_EMU) + + #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ ++ _TIF_NEED_RESCHED_LAZY | \ + _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ + _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \ + _TIF_NOTIFY_SIGNAL) + #define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR) ++#define _TIF_NEED_RESCHED_MASK (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY) + + /* Bits in local_flags */ + /* Don't move TLF_NAPPING without adjusting the code in entry_32.S */ +--- a/arch/powerpc/kernel/interrupt.c ++++ b/arch/powerpc/kernel/interrupt.c +@@ -346,7 +346,7 @@ interrupt_exit_user_prepare_main(unsigne + ti_flags = READ_ONCE(current_thread_info()->flags); + while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) { + local_irq_enable(); +- if (ti_flags & _TIF_NEED_RESCHED) { ++ if (ti_flags & _TIF_NEED_RESCHED_MASK) { + schedule(); + } else { + /* +@@ -552,11 +552,15 @@ notrace unsigned long interrupt_exit_ker + /* Returning to a kernel context with local irqs enabled. 
*/ + WARN_ON_ONCE(!(regs->msr & MSR_EE)); + again: +- if (IS_ENABLED(CONFIG_PREEMPT)) { ++ if (IS_ENABLED(CONFIG_PREEMPTION)) { + /* Return to preemptible kernel context */ + if (unlikely(current_thread_info()->flags & _TIF_NEED_RESCHED)) { + if (preempt_count() == 0) + preempt_schedule_irq(); ++ } else if (unlikely(current_thread_info()->flags & _TIF_NEED_RESCHED_LAZY)) { ++ if ((preempt_count() == 0) && ++ (current_thread_info()->preempt_lazy_count == 0)) ++ preempt_schedule_irq(); + } + } + diff --git a/debian/patches-rt/0282-powerpc-traps-Use-PREEMPT_RT.patch b/debian/patches-rt/powerpc__traps__Use_PREEMPT_RT.patch index 32b601830..882d0c6e2 100644 --- a/debian/patches-rt/0282-powerpc-traps-Use-PREEMPT_RT.patch +++ b/debian/patches-rt/powerpc__traps__Use_PREEMPT_RT.patch @@ -1,21 +1,23 @@ -From 6a65cebb2e708e5b2ac01a027eb501db810b3582 Mon Sep 17 00:00:00 2001 +Subject: powerpc: traps: Use PREEMPT_RT +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Fri Jul 26 11:30:49 2019 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Fri, 26 Jul 2019 11:30:49 +0200 -Subject: [PATCH 282/296] powerpc: traps: Use PREEMPT_RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Add PREEMPT_RT to the backtrace if enabled. 
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/powerpc/kernel/traps.c | 7 ++++++- + arch/powerpc/kernel/traps.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) - -diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c -index fc82a7670d37..34861e39d03a 100644 +--- --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c -@@ -259,12 +259,17 @@ static char *get_mmu_str(void) +@@ -260,12 +260,17 @@ static char *get_mmu_str(void) static int __die(const char *str, struct pt_regs *regs, long err) { @@ -34,6 +36,3 @@ index fc82a7670d37..34861e39d03a 100644 IS_ENABLED(CONFIG_SMP) ? " SMP" : "", IS_ENABLED(CONFIG_SMP) ? (" NR_CPUS=" __stringify(NR_CPUS)) : "", debug_pagealloc_enabled() ? " DEBUG_PAGEALLOC" : "", --- -2.30.2 - diff --git a/debian/patches-rt/0284-powerpc-kvm-Disable-in-kernel-MPIC-emulation-for-PRE.patch b/debian/patches-rt/powerpc_kvm__Disable_in-kernel_MPIC_emulation_for_PREEMPT_RT.patch index 77070971b..aa885707e 100644 --- a/debian/patches-rt/0284-powerpc-kvm-Disable-in-kernel-MPIC-emulation-for-PRE.patch +++ b/debian/patches-rt/powerpc_kvm__Disable_in-kernel_MPIC_emulation_for_PREEMPT_RT.patch @@ -1,9 +1,9 @@ -From 5f3ba6ceaf8c5a10f1704c11dd9d1ee8f5271abd Mon Sep 17 00:00:00 2001 +Subject: powerpc/kvm: Disable in-kernel MPIC emulation for PREEMPT_RT +From: Bogdan Purcareata <bogdan.purcareata@freescale.com> +Date: Fri Apr 24 15:53:13 2015 +0000 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Bogdan Purcareata <bogdan.purcareata@freescale.com> -Date: Fri, 24 Apr 2015 15:53:13 +0000 -Subject: [PATCH 284/296] powerpc/kvm: Disable in-kernel MPIC emulation for - PREEMPT_RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz While converting the openpic emulation code to use a raw_spinlock_t enables guests to run on RT, there's still 
a performance issue. For interrupts sent in @@ -24,12 +24,13 @@ proper openpic emulation that would be better suited for RT. Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/powerpc/kvm/Kconfig | 1 + + arch/powerpc/kvm/Kconfig | 1 + 1 file changed, 1 insertion(+) - -diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig -index 549591d9aaa2..efb5bfe93f70 100644 +--- --- a/arch/powerpc/kvm/Kconfig +++ b/arch/powerpc/kvm/Kconfig @@ -178,6 +178,7 @@ config KVM_E500MC @@ -40,6 +41,3 @@ index 549591d9aaa2..efb5bfe93f70 100644 select HAVE_KVM_IRQCHIP select HAVE_KVM_IRQFD select HAVE_KVM_IRQ_ROUTING --- -2.30.2 - diff --git a/debian/patches-rt/0283-powerpc-pseries-iommu-Use-a-locallock-instead-local_.patch b/debian/patches-rt/powerpc_pseries_iommu__Use_a_locallock_instead_local_irq_save.patch index 5bcf4ee09..7d9e7e5a9 100644 --- a/debian/patches-rt/0283-powerpc-pseries-iommu-Use-a-locallock-instead-local_.patch +++ b/debian/patches-rt/powerpc_pseries_iommu__Use_a_locallock_instead_local_irq_save.patch @@ -1,9 +1,9 @@ -From 2b6d8848b340ef6c59b9e482692727e32cde5ecc Mon Sep 17 00:00:00 2001 +Subject: powerpc/pseries/iommu: Use a locallock instead local_irq_save() +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue Mar 26 18:31:54 2019 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 26 Mar 2019 18:31:54 +0100 -Subject: [PATCH 283/296] powerpc/pseries/iommu: Use a locallock instead - local_irq_save() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz The locallock protects the per-CPU variable tce_page. 
The function attempts to allocate memory while tce_page is protected (by disabling @@ -13,12 +13,13 @@ Use local_irq_save() instead of local_irq_disable(). Cc: stable-rt@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/powerpc/platforms/pseries/iommu.c | 31 +++++++++++++++++--------- + arch/powerpc/platforms/pseries/iommu.c | 31 ++++++++++++++++++++----------- 1 file changed, 20 insertions(+), 11 deletions(-) - -diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c -index e4198700ed1a..62bd38cd80d1 100644 +--- --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -24,6 +24,7 @@ @@ -29,7 +30,7 @@ index e4198700ed1a..62bd38cd80d1 100644 #include <asm/io.h> #include <asm/prom.h> #include <asm/rtas.h> -@@ -190,7 +191,13 @@ static int tce_build_pSeriesLP(unsigned long liobn, long tcenum, long tceshift, +@@ -195,7 +196,13 @@ static int tce_build_pSeriesLP(unsigned return ret; } @@ -44,7 +45,7 @@ index e4198700ed1a..62bd38cd80d1 100644 static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum, long npages, unsigned long uaddr, -@@ -212,9 +219,10 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum, +@@ -218,9 +225,10 @@ static int tce_buildmulti_pSeriesLP(stru direction, attrs); } @@ -57,22 +58,22 @@ index e4198700ed1a..62bd38cd80d1 100644 /* This is safe to do since interrupts are off when we're called * from iommu_alloc{,_sg}() -@@ -223,12 +231,12 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum, +@@ -229,12 +237,12 @@ static int tce_buildmulti_pSeriesLP(stru tcep = (__be64 *)__get_free_page(GFP_ATOMIC); /* If allocation fails, fall back to the loop implementation */ if (!tcep) { - local_irq_restore(flags); + local_unlock_irqrestore(&tce_page.lock, flags); return tce_build_pSeriesLP(tbl->it_index, tcenum, - tbl->it_page_shift, + 
tceshift, npages, uaddr, direction, attrs); } - __this_cpu_write(tce_page, tcep); + __this_cpu_write(tce_page.page, tcep); } - rpn = __pa(uaddr) >> TCE_SHIFT; -@@ -258,7 +266,7 @@ static int tce_buildmulti_pSeriesLP(struct iommu_table *tbl, long tcenum, + rpn = __pa(uaddr) >> tceshift; +@@ -264,7 +272,7 @@ static int tce_buildmulti_pSeriesLP(stru tcenum += limit; } while (npages > 0 && !rc); @@ -81,7 +82,7 @@ index e4198700ed1a..62bd38cd80d1 100644 if (unlikely(rc == H_NOT_ENOUGH_RESOURCES)) { ret = (int)rc; -@@ -429,16 +437,17 @@ static int tce_setrange_multi_pSeriesLP(unsigned long start_pfn, +@@ -440,16 +448,17 @@ static int tce_setrange_multi_pSeriesLP( DMA_BIDIRECTIONAL, 0); } @@ -103,7 +104,7 @@ index e4198700ed1a..62bd38cd80d1 100644 } proto_tce = TCE_PCI_READ | TCE_PCI_WRITE; -@@ -481,7 +490,7 @@ static int tce_setrange_multi_pSeriesLP(unsigned long start_pfn, +@@ -492,7 +501,7 @@ static int tce_setrange_multi_pSeriesLP( /* error cleanup: caller will clear whole range */ @@ -112,6 +113,3 @@ index e4198700ed1a..62bd38cd80d1 100644 return rc; } --- -2.30.2 - diff --git a/debian/patches-rt/0285-powerpc-stackprotector-work-around-stack-guard-init-.patch b/debian/patches-rt/powerpc_stackprotector__work_around_stack-guard_init_from_atomic.patch index e27cdc6b4..42aacaf1d 100644 --- a/debian/patches-rt/0285-powerpc-stackprotector-work-around-stack-guard-init-.patch +++ b/debian/patches-rt/powerpc_stackprotector__work_around_stack-guard_init_from_atomic.patch @@ -1,9 +1,9 @@ -From 8f2f8163e3701da0b0b62f8c1eafdf175e1f8807 Mon Sep 17 00:00:00 2001 +Subject: powerpc/stackprotector: work around stack-guard init from atomic +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue Mar 26 18:31:29 2019 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 26 Mar 2019 18:31:29 +0100 -Subject: [PATCH 285/296] powerpc/stackprotector: work around 
stack-guard init - from atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz This is invoked from the secondary CPU in atomic context. On x86 we use tsc instead. On Power we XOR it against mftb() so lets use stack address @@ -11,15 +11,16 @@ as the initial value. Cc: stable-rt@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/powerpc/include/asm/stackprotector.h | 4 ++++ + arch/powerpc/include/asm/stackprotector.h | 4 ++++ 1 file changed, 4 insertions(+) - -diff --git a/arch/powerpc/include/asm/stackprotector.h b/arch/powerpc/include/asm/stackprotector.h -index 1c8460e23583..b1653c160bab 100644 +--- --- a/arch/powerpc/include/asm/stackprotector.h +++ b/arch/powerpc/include/asm/stackprotector.h -@@ -24,7 +24,11 @@ static __always_inline void boot_init_stack_canary(void) +@@ -24,7 +24,11 @@ static __always_inline void boot_init_st unsigned long canary; /* Try to get a semi random initial value. */ @@ -31,6 +32,3 @@ index 1c8460e23583..b1653c160bab 100644 canary ^= mftb(); canary ^= LINUX_VERSION_CODE; canary &= CANARY_MASK; --- -2.30.2 - diff --git a/debian/patches-rt/printk__Enhance_the_condition_check_of_msleep_in_pr_flush.patch b/debian/patches-rt/printk__Enhance_the_condition_check_of_msleep_in_pr_flush.patch new file mode 100644 index 000000000..29274c007 --- /dev/null +++ b/debian/patches-rt/printk__Enhance_the_condition_check_of_msleep_in_pr_flush.patch @@ -0,0 +1,41 @@ +Subject: printk: Enhance the condition check of msleep in pr_flush() +From: Chao Qin <chao.qin@intel.com> +Date: Mon Jul 19 10:26:50 2021 +0800 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Chao Qin <chao.qin@intel.com> + +There is msleep in pr_flush(). 
If call WARN() in the early boot +stage such as in early_initcall, pr_flush() will run into msleep +when process scheduler is not ready yet. And then the system will +sleep forever. + +Before the system_state is SYSTEM_RUNNING, make sure DO NOT sleep +in pr_flush(). + +Fixes: c0b395bd0fe3("printk: add pr_flush()") +Signed-off-by: Chao Qin <chao.qin@intel.com> +Signed-off-by: Lili Li <lili.li@intel.com> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Reviewed-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Link: https://lore.kernel.org/lkml/20210719022649.3444072-1-chao.qin@intel.com + +--- + kernel/printk/printk.c | 4 +++- + 1 file changed, 3 insertions(+), 1 deletion(-) +--- +--- a/kernel/printk/printk.c ++++ b/kernel/printk/printk.c +@@ -3649,7 +3649,9 @@ bool pr_flush(int timeout_ms, bool reset + u64 diff; + u64 seq; + +- may_sleep = (preemptible() && !in_softirq()); ++ may_sleep = (preemptible() && ++ !in_softirq() && ++ system_state >= SYSTEM_RUNNING); + + seq = prb_next_seq(prb); + diff --git a/debian/patches-rt/0108-printk-add-console-handover.patch b/debian/patches-rt/printk__add_console_handover.patch index 3748daa94..c5776921f 100644 --- a/debian/patches-rt/0108-printk-add-console-handover.patch +++ b/debian/patches-rt/printk__add_console_handover.patch @@ -1,8 +1,9 @@ -From 8b0e42a94653d4061f0dcd88e6df9346814bd3a0 Mon Sep 17 00:00:00 2001 +Subject: printk: add console handover +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:09 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:09 +0106 -Subject: [PATCH 108/296] printk: add console handover -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz If earlyprintk is used, a boot console will print 
directly to the console immediately. The boot console will unregister itself as soon @@ -20,16 +21,18 @@ take over. Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/console.h | 1 + - kernel/printk/printk.c | 8 +++++++- - 2 files changed, 8 insertions(+), 1 deletion(-) +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -diff --git a/include/linux/console.h b/include/linux/console.h -index 8b97a3e8f1ec..e6ff51883dee 100644 + + +--- + include/linux/console.h | 1 + + kernel/printk/printk.c | 15 +++++++++++++-- + 2 files changed, 14 insertions(+), 2 deletions(-) +--- --- a/include/linux/console.h +++ b/include/linux/console.h -@@ -138,6 +138,7 @@ static inline int con_debug_leave(void) +@@ -143,6 +143,7 @@ static inline int con_debug_leave(void) #define CON_ANYTIME (16) /* Safe to call when cpu is offline */ #define CON_BRL (32) /* Used for a braille device */ #define CON_EXTENDED (64) /* Use the extended output format a la /dev/kmsg */ @@ -37,11 +40,9 @@ index 8b97a3e8f1ec..e6ff51883dee 100644 struct console { char name[16]; -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index f5fd3e1671fe..18e3aabb2c5b 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c -@@ -1727,6 +1727,8 @@ static bool console_can_sync(struct console *con) +@@ -1746,6 +1746,8 @@ static bool console_may_sync(struct cons return false; if (con->write_atomic && kernel_sync_mode()) return true; @@ -50,16 +51,23 @@ index f5fd3e1671fe..18e3aabb2c5b 100644 if (con->write && (con->flags & CON_BOOT) && !con->thread) return true; return false; -@@ -1738,6 +1740,8 @@ static bool call_sync_console_driver(struct console *con, const char *text, size - return false; - if (con->write_atomic && kernel_sync_mode()) - con->write_atomic(con, text, text_len); -+ else if (con->write_atomic && (con->flags & CON_HANDOVER) && !con->thread) -+ con->write_atomic(con, text, text_len); - else if (con->write 
&& (con->flags & CON_BOOT) && !con->thread) - con->write(con, text, text_len); - else -@@ -2823,8 +2827,10 @@ void register_console(struct console *newcon) +@@ -1761,7 +1763,14 @@ static bool call_sync_console_driver(str + return true; + } + +- if (con->write && (con->flags & CON_BOOT) && !con->thread) { ++ if (con->write_atomic && (con->flags & CON_HANDOVER) && !con->thread) { ++ if (console_trylock()) { ++ con->write_atomic(con, text, text_len); ++ console_unlock(); ++ return true; ++ } ++ ++ } else if (con->write && (con->flags & CON_BOOT) && !con->thread) { + if (console_trylock()) { + con->write(con, text, text_len); + console_unlock(); +@@ -2891,8 +2900,10 @@ void register_console(struct console *ne * the real console are the same physical device, it's annoying to * see the beginning boot messages twice */ @@ -71,6 +79,3 @@ index f5fd3e1671fe..18e3aabb2c5b 100644 /* * Put this console in the list - keep the --- -2.30.2 - diff --git a/debian/patches-rt/0109-printk-add-pr_flush.patch b/debian/patches-rt/printk__add_pr_flush.patch index c5161de97..11879f560 100644 --- a/debian/patches-rt/0109-printk-add-pr_flush.patch +++ b/debian/patches-rt/printk__add_pr_flush.patch @@ -1,8 +1,9 @@ -From 58f2825dbbd85782452af04478f2d6139133e8e9 Mon Sep 17 00:00:00 2001 +Subject: printk: add pr_flush() +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:10 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:10 +0106 -Subject: [PATCH 109/296] printk: add pr_flush() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Provide a function to allow waiting for console printers to catch up to the latest logged message. @@ -13,32 +14,41 @@ pr_flush() is only used in the most common error paths: panic(), print_oops_end_marker(), report_bug(), kmsg_dump(). 
Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/printk.h | 2 ++ - kernel/panic.c | 28 +++++++++------ - kernel/printk/printk.c | 79 ++++++++++++++++++++++++++++++++++++++++++ - lib/bug.c | 1 + - 4 files changed, 99 insertions(+), 11 deletions(-) +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -diff --git a/include/linux/printk.h b/include/linux/printk.h -index 153212445b68..7e4352467d83 100644 +--- + include/linux/printk.h | 7 ++++ + kernel/panic.c | 28 ++++++++++------ + kernel/printk/printk.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++ + lib/bug.c | 1 + 4 files changed, 106 insertions(+), 11 deletions(-) +--- --- a/include/linux/printk.h +++ b/include/linux/printk.h -@@ -481,6 +481,8 @@ extern int kptr_restrict; - no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__) - #endif +@@ -161,6 +161,8 @@ int vprintk(const char *fmt, va_list arg + asmlinkage __printf(1, 2) __cold + int _printk(const char *fmt, ...); +bool pr_flush(int timeout_ms, bool reset_on_progress); + /* - * ratelimited messages with local ratelimit_state, - * no local ratelimit_state used in the !PRINTK case -diff --git a/kernel/panic.c b/kernel/panic.c -index 1f0df42f8d0c..0efdac3cf94e 100644 + * Please don't use printk_ratelimit(), because it shares ratelimiting state + * with all other unrelated printk_ratelimit() callsites. Instead use +@@ -201,6 +203,11 @@ int _printk(const char *s, ...) + return 0; + } + ++static inline bool pr_flush(int timeout_ms, bool reset_on_progress) ++{ ++ return true; ++} ++ + static inline int printk_ratelimit(void) + { + return 0; --- a/kernel/panic.c +++ b/kernel/panic.c -@@ -177,12 +177,28 @@ static void panic_print_sys_info(void) +@@ -178,12 +178,28 @@ static void panic_print_sys_info(void) void panic(const char *fmt, ...) { static char buf[1024]; @@ -67,7 +77,7 @@ index 1f0df42f8d0c..0efdac3cf94e 100644 /* * Disable local interrupts. 
This will prevent panic_smp_self_stop * from deadlocking the first cpu that invokes the panic, since -@@ -213,24 +229,13 @@ void panic(const char *fmt, ...) +@@ -214,24 +230,13 @@ void panic(const char *fmt, ...) if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu) panic_smp_self_stop(); @@ -92,7 +102,7 @@ index 1f0df42f8d0c..0efdac3cf94e 100644 /* * If kgdb is enabled, give it a chance to run before we stop all * the other CPUs or else we won't be able to debug processes left -@@ -552,6 +557,7 @@ static void print_oops_end_marker(void) +@@ -553,6 +558,7 @@ static void print_oops_end_marker(void) { init_oops_id(); pr_warn("---[ end trace %016llx ]---\n", (unsigned long long)oops_id); @@ -100,11 +110,9 @@ index 1f0df42f8d0c..0efdac3cf94e 100644 } /* -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 18e3aabb2c5b..f56fd2e34cc7 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c -@@ -3224,6 +3224,12 @@ void kmsg_dump(enum kmsg_dump_reason reason) +@@ -3285,6 +3285,12 @@ void kmsg_dump(enum kmsg_dump_reason rea sync_mode = true; pr_info("enabled sync mode\n"); } @@ -117,11 +125,12 @@ index 18e3aabb2c5b..f56fd2e34cc7 100644 } rcu_read_lock(); -@@ -3503,3 +3509,76 @@ void console_atomic_unlock(unsigned int flags) - prb_unlock(&printk_cpulock, flags); +@@ -3606,3 +3612,78 @@ bool kgdb_roundup_delay(unsigned int cpu } - EXPORT_SYMBOL(console_atomic_unlock); + EXPORT_SYMBOL(kgdb_roundup_delay); + #endif /* CONFIG_SMP */ + ++#ifdef CONFIG_PRINTK +static void pr_msleep(bool may_sleep, int ms) +{ + if (may_sleep) { @@ -167,7 +176,7 @@ index 18e3aabb2c5b..f56fd2e34cc7 100644 + for_each_console(con) { + if (!(con->flags & CON_ENABLED)) + continue; -+ printk_seq = atomic64_read(&con->printk_seq); ++ printk_seq = read_console_seq(con); + if (printk_seq < seq) + diff += seq - printk_seq; + } @@ -175,7 +184,7 @@ index 18e3aabb2c5b..f56fd2e34cc7 100644 + if (diff != last_diff && reset_on_progress) + remaining = timeout_ms; + -+ if (!diff || 
remaining == 0) ++ if (diff == 0 || remaining == 0) + break; + + if (remaining < 0) { @@ -194,11 +203,10 @@ index 18e3aabb2c5b..f56fd2e34cc7 100644 + return (diff == 0); +} +EXPORT_SYMBOL(pr_flush); -diff --git a/lib/bug.c b/lib/bug.c -index 7103440c0ee1..baf61c307a6a 100644 ++#endif /* CONFIG_PRINTK */ --- a/lib/bug.c +++ b/lib/bug.c -@@ -205,6 +205,7 @@ enum bug_trap_type report_bug(unsigned long bugaddr, struct pt_regs *regs) +@@ -206,6 +206,7 @@ enum bug_trap_type report_bug(unsigned l else pr_crit("Kernel BUG at %pB [verbose debug info unavailable]\n", (void *)bugaddr); @@ -206,6 +214,3 @@ index 7103440c0ee1..baf61c307a6a 100644 return BUG_TRAP_TYPE_BUG; } --- -2.30.2 - diff --git a/debian/patches-rt/0103-printk-combine-boot_delay_msec-into-printk_delay.patch b/debian/patches-rt/printk__call_boot_delay_msec_in_printk_delay.patch index 6be45a3ea..42afe9614 100644 --- a/debian/patches-rt/0103-printk-combine-boot_delay_msec-into-printk_delay.patch +++ b/debian/patches-rt/printk__call_boot_delay_msec_in_printk_delay.patch @@ -1,23 +1,24 @@ -From 59b15459085c4d0c44c9e579bc6b87b1dec1c3f4 Mon Sep 17 00:00:00 2001 +Subject: printk: call boot_delay_msec() in printk_delay() +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:04 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:04 +0106 -Subject: [PATCH 103/296] printk: combine boot_delay_msec() into printk_delay() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz boot_delay_msec() is always called immediately before printk_delay() -so just combine the two. +so just call it from within printk_delay(). 
Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + --- - kernel/printk/printk.c | 7 ++++--- + kernel/printk/printk.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) - -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 157417654b65..0333d11966ac 100644 +--- --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c -@@ -1728,8 +1728,10 @@ SYSCALL_DEFINE3(syslog, int, type, char __user *, buf, int, len) +@@ -1750,8 +1750,10 @@ SYSCALL_DEFINE3(syslog, int, type, char int printk_delay_msec __read_mostly; @@ -29,7 +30,7 @@ index 157417654b65..0333d11966ac 100644 if (unlikely(printk_delay_msec)) { int m = printk_delay_msec; -@@ -2187,8 +2189,7 @@ asmlinkage int vprintk_emit(int facility, int level, +@@ -2223,8 +2225,7 @@ asmlinkage int vprintk_emit(int facility in_sched = true; } @@ -39,6 +40,3 @@ index 157417654b65..0333d11966ac 100644 printed_len = vprintk_store(facility, level, dev_info, fmt, args); --- -2.30.2 - diff --git a/debian/patches-rt/0105-printk-introduce-kernel-sync-mode.patch b/debian/patches-rt/printk__introduce_kernel_sync_mode.patch index ac310af87..683f1ef4a 100644 --- a/debian/patches-rt/0105-printk-introduce-kernel-sync-mode.patch +++ b/debian/patches-rt/printk__introduce_kernel_sync_mode.patch @@ -1,8 +1,9 @@ -From 137e3c0d841608ac0c021684801ef32311069300 Mon Sep 17 00:00:00 2001 +Subject: printk: introduce kernel sync mode +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:06 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:06 +0106 -Subject: [PATCH 105/296] printk: introduce kernel sync mode -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz When the kernel performs an OOPS, enter into "sync 
mode": @@ -13,15 +14,14 @@ CONSOLE_LOG_MAX is moved to printk.h to support the per-console buffer used in sync mode. Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/console.h | 4 ++ - include/linux/printk.h | 6 ++ - kernel/printk/printk.c | 133 ++++++++++++++++++++++++++++++++++++++-- - 3 files changed, 137 insertions(+), 6 deletions(-) +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -diff --git a/include/linux/console.h b/include/linux/console.h -index 46c27780ea39..d0109ec4e78b 100644 +--- + include/linux/console.h | 4 + + include/linux/printk.h | 6 + + kernel/printk/printk.c | 178 +++++++++++++++++++++++++++++++++++++++++++++--- + 3 files changed, 178 insertions(+), 10 deletions(-) +--- --- a/include/linux/console.h +++ b/include/linux/console.h @@ -16,6 +16,7 @@ @@ -32,21 +32,19 @@ index 46c27780ea39..d0109ec4e78b 100644 struct vc_data; struct console_font_op; -@@ -151,6 +152,9 @@ struct console { +@@ -150,6 +151,9 @@ struct console { short flags; short index; int cflag; +#ifdef CONFIG_PRINTK + char sync_buf[CONSOLE_LOG_MAX]; +#endif + uint ispeed; + uint ospeed; void *data; - struct console *next; - }; -diff --git a/include/linux/printk.h b/include/linux/printk.h -index 2476796c1150..1ebd93581acc 100644 --- a/include/linux/printk.h +++ b/include/linux/printk.h -@@ -46,6 +46,12 @@ static inline const char *printk_skip_headers(const char *buffer) +@@ -47,6 +47,12 @@ static inline const char *printk_skip_he #define CONSOLE_EXT_LOG_MAX 8192 @@ -54,25 +52,23 @@ index 2476796c1150..1ebd93581acc 100644 + * The maximum size of a record formatted for console printing + * (i.e. with the prefix prepended to every line). + */ -+#define CONSOLE_LOG_MAX 4096 ++#define CONSOLE_LOG_MAX 1024 + /* printk's without a loglevel use this.. 
*/ #define MESSAGE_LOGLEVEL_DEFAULT CONFIG_MESSAGE_LOGLEVEL_DEFAULT -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index a3763f25cede..1f225b32fcbd 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c -@@ -44,6 +44,7 @@ - #include <linux/irq_work.h> +@@ -45,6 +45,7 @@ #include <linux/ctype.h> #include <linux/uio.h> + #include <linux/kgdb.h> +#include <linux/clocksource.h> #include <linux/sched/clock.h> #include <linux/sched/debug.h> #include <linux/sched/task_stack.h> -@@ -359,6 +360,9 @@ enum log_flags { - static DEFINE_SPINLOCK(syslog_lock); +@@ -355,6 +356,9 @@ static int console_msg_format = MSG_FORM + static DEFINE_MUTEX(syslog_lock); #ifdef CONFIG_PRINTK +/* Set to enable sync mode. Once set, it is never cleared. */ @@ -81,17 +77,38 @@ index a3763f25cede..1f225b32fcbd 100644 DECLARE_WAIT_QUEUE_HEAD(log_wait); /* All 3 protected by @syslog_lock. */ /* the next printk record to read by syslog(READ) or /proc/kmsg */ -@@ -398,9 +402,6 @@ static struct latched_seq clear_seq = { - /* the maximum size allowed to be reserved for a record */ - #define LOG_LINE_MAX (1024 - PREFIX_MAX) +@@ -382,6 +386,20 @@ static struct latched_seq console_seq = + .val[1] = 0, + }; + ++static struct latched_seq console_sync_seq = { ++ .latch = SEQCNT_LATCH_ZERO(console_sync_seq.latch), ++ .val[0] = 0, ++ .val[1] = 0, ++}; ++ ++#ifdef CONFIG_HAVE_NMI ++static struct latched_seq console_sync_nmi_seq = { ++ .latch = SEQCNT_LATCH_ZERO(console_sync_nmi_seq.latch), ++ .val[0] = 0, ++ .val[1] = 0, ++}; ++#endif ++ + /* + * The next printk record to read after the last 'clear' command. There are + * two copies (updated with seqcount_latch) so that reads can locklessly +@@ -399,9 +417,6 @@ static struct latched_seq clear_seq = { + #define PREFIX_MAX 32 + #endif -/* the maximum size of a formatted record (i.e. 
with prefix added per line) */ --#define CONSOLE_LOG_MAX 4096 +-#define CONSOLE_LOG_MAX 1024 - - #define LOG_LEVEL(v) ((v) & 0x07) - #define LOG_FACILITY(v) ((v) >> 3 & 0xff) + /* the maximum size allowed to be reserved for a record */ + #define LOG_LINE_MAX (CONSOLE_LOG_MAX - PREFIX_MAX) -@@ -1743,6 +1744,91 @@ static inline void printk_delay(int level) +@@ -1773,6 +1788,116 @@ static inline void printk_delay(int leve } } @@ -100,7 +117,7 @@ index a3763f25cede..1f225b32fcbd 100644 + return (oops_in_progress || sync_mode); +} + -+static bool console_can_sync(struct console *con) ++static bool console_may_sync(struct console *con) +{ + if (!(con->flags & CON_ENABLED)) + return false; @@ -163,27 +180,52 @@ index a3763f25cede..1f225b32fcbd 100644 + return true; +} + ++static u64 read_console_seq(void) ++{ ++ u64 seq2; ++ u64 seq; ++ ++ seq = latched_seq_read_nolock(&console_seq); ++ seq2 = latched_seq_read_nolock(&console_sync_seq); ++ if (seq2 > seq) ++ seq = seq2; ++#ifdef CONFIG_HAVE_NMI ++ seq2 = latched_seq_read_nolock(&console_sync_nmi_seq); ++ if (seq2 > seq) ++ seq = seq2; ++#endif ++ return seq; ++} ++ +static void print_sync_until(struct console *con, u64 seq) +{ -+ unsigned int flags; + u64 printk_seq; + -+ console_atomic_lock(&flags); ++ while (!__printk_cpu_trylock()) ++ cpu_relax(); ++ + for (;;) { -+ printk_seq = atomic64_read(&console_seq); ++ printk_seq = read_console_seq(); + if (printk_seq >= seq) + break; + if (!print_sync(con, &printk_seq)) + break; -+ atomic64_set(&console_seq, printk_seq + 1); ++#ifdef CONFIG_PRINTK_NMI ++ if (in_nmi()) { ++ latched_seq_write(&console_sync_nmi_seq, printk_seq + 1); ++ continue; ++ } ++#endif ++ latched_seq_write(&console_sync_seq, printk_seq + 1); + } -+ console_atomic_unlock(flags); ++ ++ __printk_cpu_unlock(); +} + /* * Special console_lock variants that help to reduce the risk of soft-lockups. * They allow to pass console_lock to another printk() call using a busy wait. 
-@@ -1917,6 +2003,8 @@ static void call_console_drivers(const char *ext_text, size_t ext_len, +@@ -1947,6 +2072,8 @@ static void call_console_drivers(const c if (!cpu_online(smp_processor_id()) && !(con->flags & CON_ANYTIME)) continue; @@ -192,15 +234,15 @@ index a3763f25cede..1f225b32fcbd 100644 if (con->flags & CON_EXTENDED) con->write(con, ext_text, ext_len); else { -@@ -2071,6 +2159,7 @@ int vprintk_store(int facility, int level, +@@ -2114,6 +2241,7 @@ int vprintk_store(int facility, int leve const u32 caller_id = printk_caller_id(); struct prb_reserved_entry e; - enum log_flags lflags = 0; + enum printk_info_flags flags = 0; + bool final_commit = false; struct printk_record r; unsigned long irqflags; u16 trunc_msg_len = 0; -@@ -2080,6 +2169,7 @@ int vprintk_store(int facility, int level, +@@ -2124,6 +2252,7 @@ int vprintk_store(int facility, int leve u16 text_len; int ret = 0; u64 ts_nsec; @@ -208,37 +250,36 @@ index a3763f25cede..1f225b32fcbd 100644 /* * Since the duration of printk() can vary depending on the message -@@ -2118,6 +2208,7 @@ int vprintk_store(int facility, int level, - if (lflags & LOG_CONT) { +@@ -2162,6 +2291,7 @@ int vprintk_store(int facility, int leve + if (flags & LOG_CONT) { prb_rec_init_wr(&r, reserve_size); if (prb_reserve_in_last(&e, prb, &r, caller_id, LOG_LINE_MAX)) { + seq = r.info->seq; text_len = printk_sprint(&r.text_buf[r.info->text_len], reserve_size, - facility, &lflags, fmt, args); + facility, &flags, fmt, args); r.info->text_len += text_len; -@@ -2125,6 +2216,7 @@ int vprintk_store(int facility, int level, - if (lflags & LOG_NEWLINE) { +@@ -2169,6 +2299,7 @@ int vprintk_store(int facility, int leve + if (flags & LOG_NEWLINE) { r.info->flags |= LOG_NEWLINE; prb_final_commit(&e); + final_commit = true; } else { prb_commit(&e); } -@@ -2149,6 +2241,8 @@ int vprintk_store(int facility, int level, +@@ -2192,6 +2323,7 @@ int vprintk_store(int facility, int leve + if (!prb_reserve(&e, prb, &r)) goto out; } - + seq = r.info->seq; 
-+ + /* fill message */ - text_len = printk_sprint(&r.text_buf[0], reserve_size, facility, &lflags, fmt, args); - if (trunc_msg_len) -@@ -2163,13 +2257,25 @@ int vprintk_store(int facility, int level, + text_len = printk_sprint(&r.text_buf[0], reserve_size, facility, &flags, fmt, args); +@@ -2207,13 +2339,25 @@ int vprintk_store(int facility, int leve memcpy(&r.info->dev_info, dev_info, sizeof(r.info->dev_info)); /* A message without a trailing newline can be continued. */ -- if (!(lflags & LOG_NEWLINE)) -+ if (!(lflags & LOG_NEWLINE)) { +- if (!(flags & LOG_NEWLINE)) ++ if (!(flags & LOG_NEWLINE)) { prb_commit(&e); - else + } else { @@ -253,15 +294,15 @@ index a3763f25cede..1f225b32fcbd 100644 + struct console *con; + + for_each_console(con) { -+ if (console_can_sync(con)) ++ if (console_may_sync(con)) + print_sync_until(con, seq + 1); + } + } + - printk_exit_irqrestore(irqflags); + printk_exit_irqrestore(recursion_ptr, irqflags); return ret; } -@@ -2265,12 +2371,13 @@ EXPORT_SYMBOL(printk); +@@ -2282,13 +2426,13 @@ EXPORT_SYMBOL(_printk); #else /* CONFIG_PRINTK */ @@ -270,13 +311,14 @@ index a3763f25cede..1f225b32fcbd 100644 #define prb_read_valid(rb, seq, r) false #define prb_first_valid_seq(rb) 0 +-#define latched_seq_read_nolock(seq) 0 ++#define read_console_seq() 0 + #define latched_seq_write(dst, src) ++#define kernel_sync_mode() false -+#define kernel_sync_mode() false -+ - static u64 syslog_seq; - static atomic64_t console_seq = ATOMIC64_INIT(0); static u64 exclusive_console_stop_seq; -@@ -2556,6 +2663,8 @@ static int have_callable_console(void) + static unsigned long console_dropped; +@@ -2592,6 +2736,8 @@ static int have_callable_console(void) */ static inline int can_use_console(void) { @@ -285,8 +327,35 @@ index a3763f25cede..1f225b32fcbd 100644 return cpu_online(raw_smp_processor_id()) || have_callable_console(); } -@@ -3370,6 +3479,18 @@ void kmsg_dump(enum kmsg_dump_reason reason) - struct kmsg_dumper_iter iter; +@@ -2661,7 +2807,7 @@ void 
console_unlock(void) + size_t len; + + skip: +- seq = latched_seq_read_nolock(&console_seq); ++ seq = read_console_seq(); + if (!prb_read_valid(prb, seq, &r)) + break; + +@@ -2741,7 +2887,7 @@ void console_unlock(void) + * there's a new owner and the console_unlock() from them will do the + * flush, no worries. + */ +- retry = prb_read_valid(prb, latched_seq_read_nolock(&console_seq), NULL); ++ retry = prb_read_valid(prb, read_console_seq(), NULL); + if (retry && console_trylock()) + goto again; + } +@@ -3041,7 +3187,7 @@ void register_console(struct console *ne + * ignores console_lock. + */ + exclusive_console = newcon; +- exclusive_console_stop_seq = latched_seq_read_nolock(&console_seq); ++ exclusive_console_stop_seq = read_console_seq(); + + /* Get a consistent copy of @syslog_seq. */ + mutex_lock(&syslog_lock); +@@ -3411,6 +3557,18 @@ void kmsg_dump(enum kmsg_dump_reason rea + { struct kmsg_dumper *dumper; + if (!oops_in_progress) { @@ -304,6 +373,3 @@ index a3763f25cede..1f225b32fcbd 100644 rcu_read_lock(); list_for_each_entry_rcu(dumper, &dump_list, list) { enum kmsg_dump_reason max_reason = dumper->max_reason; --- -2.30.2 - diff --git a/debian/patches-rt/0106-printk-move-console-printing-to-kthreads.patch b/debian/patches-rt/printk__move_console_printing_to_kthreads.patch index 575a11f53..ab6845670 100644 --- a/debian/patches-rt/0106-printk-move-console-printing-to-kthreads.patch +++ b/debian/patches-rt/printk__move_console_printing_to_kthreads.patch @@ -1,8 +1,9 @@ -From f71b62e905fd8d2573cf0094ec9c50e30cfbb836 Mon Sep 17 00:00:00 2001 +Subject: printk: move console printing to kthreads +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:07 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:07 +0106 -Subject: [PATCH 106/296] printk: move console printing to kthreads -Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Create a kthread for each console to perform console printing. Now all console printing is fully asynchronous except for the boot @@ -13,62 +14,78 @@ The console_lock() and console_unlock() functions now only do what their name says... locking and unlocking of the console. Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> ---- - include/linux/console.h | 2 + - kernel/printk/printk.c | 625 ++++++++++++---------------------------- - 2 files changed, 186 insertions(+), 441 deletions(-) +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> -diff --git a/include/linux/console.h b/include/linux/console.h -index d0109ec4e78b..8b97a3e8f1ec 100644 +--- + include/linux/console.h | 13 + kernel/printk/printk.c | 715 ++++++++++++++---------------------------------- + 2 files changed, 236 insertions(+), 492 deletions(-) +--- --- a/include/linux/console.h +++ b/include/linux/console.h -@@ -155,6 +155,8 @@ struct console { +@@ -17,6 +17,12 @@ + #include <linux/atomic.h> + #include <linux/types.h> + #include <linux/printk.h> ++#include <linux/seqlock.h> ++ ++struct latched_seq { ++ seqcount_latch_t latch; ++ u64 val[2]; ++}; + + struct vc_data; + struct console_font_op; +@@ -153,7 +159,14 @@ + int cflag; #ifdef CONFIG_PRINTK char sync_buf[CONSOLE_LOG_MAX]; ++ struct latched_seq printk_seq; ++ struct latched_seq printk_sync_seq; ++#ifdef CONFIG_HAVE_NMI ++ struct latched_seq printk_sync_nmi_seq; #endif -+ atomic64_t printk_seq; ++#endif /* CONFIG_PRINTK */ ++ + struct task_struct *thread; + uint ispeed; + uint ospeed; void *data; - struct console *next; - }; -diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c -index 1f225b32fcbd..23c6e3992962 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c -@@ -44,6 +44,7 @@ - #include <linux/irq_work.h> +@@ -45,6 +45,7 @@ #include <linux/ctype.h> #include 
<linux/uio.h> + #include <linux/kgdb.h> +#include <linux/kthread.h> #include <linux/clocksource.h> #include <linux/sched/clock.h> #include <linux/sched/debug.h> -@@ -267,11 +268,6 @@ static void __up_console_sem(unsigned long ip) - */ +@@ -269,11 +270,6 @@ static int console_locked, console_suspended; --/* + /* - * If exclusive_console is non-NULL then only this console is to be printed to. - */ -static struct console *exclusive_console; - - /* +-/* * Array of consoles built from command line options (console=) */ -@@ -356,10 +352,10 @@ enum log_flags { - LOG_CONT = 8, /* text is a fragment of a continuation line */ - }; + +@@ -352,10 +348,10 @@ + * non-prinatable characters are escaped in the "\xff" notation. + */ +#ifdef CONFIG_PRINTK /* syslog_lock protects syslog_* variables and write access to clear_seq. */ - static DEFINE_SPINLOCK(syslog_lock); + static DEFINE_MUTEX(syslog_lock); -#ifdef CONFIG_PRINTK /* Set to enable sync mode. Once set, it is never cleared. */ static bool sync_mode; -@@ -370,13 +366,6 @@ static u64 syslog_seq; +@@ -366,40 +362,6 @@ static size_t syslog_partial; static bool syslog_time; @@ -76,13 +93,40 @@ index 1f225b32fcbd..23c6e3992962 100644 -static u64 exclusive_console_stop_seq; -static unsigned long console_dropped; - --/* the next printk record to write to the console */ --static atomic64_t console_seq = ATOMIC64_INIT(0); +-struct latched_seq { +- seqcount_latch_t latch; +- u64 val[2]; +-}; - - struct latched_seq { - seqcount_latch_t latch; - u64 val[2]; -@@ -1755,6 +1744,8 @@ static bool console_can_sync(struct console *con) +-/* +- * The next printk record to write to the console. There are two +- * copies (updated with seqcount_latch) so that reads can locklessly +- * access a valid value. Writers are synchronized by @console_sem. 
+- */ +-static struct latched_seq console_seq = { +- .latch = SEQCNT_LATCH_ZERO(console_seq.latch), +- .val[0] = 0, +- .val[1] = 0, +-}; +- +-static struct latched_seq console_sync_seq = { +- .latch = SEQCNT_LATCH_ZERO(console_sync_seq.latch), +- .val[0] = 0, +- .val[1] = 0, +-}; +- +-#ifdef CONFIG_HAVE_NMI +-static struct latched_seq console_sync_nmi_seq = { +- .latch = SEQCNT_LATCH_ZERO(console_sync_nmi_seq.latch), +- .val[0] = 0, +- .val[1] = 0, +-}; +-#endif +- + /* + * The next printk record to read after the last 'clear' command. There are + * two copies (updated with seqcount_latch) so that reads can locklessly +@@ -1799,6 +1761,8 @@ return false; if (con->write_atomic && kernel_sync_mode()) return true; @@ -91,32 +135,92 @@ index 1f225b32fcbd..23c6e3992962 100644 return false; } -@@ -1764,6 +1755,8 @@ static bool call_sync_console_driver(struct console *con, const char *text, size +@@ -1806,12 +1770,21 @@ + { + if (!(con->flags & CON_ENABLED)) return false; - if (con->write_atomic && kernel_sync_mode()) +- if (con->write_atomic && kernel_sync_mode()) ++ ++ if (con->write_atomic && kernel_sync_mode()) { con->write_atomic(con, text, text_len); -+ else if (con->write && (con->flags & CON_BOOT) && !con->thread) -+ con->write(con, text, text_len); - else - return false; +- else +- return false; ++ return true; ++ } -@@ -1819,202 +1812,16 @@ static void print_sync_until(struct console *con, u64 seq) +- return true; ++ if (con->write && (con->flags & CON_BOOT) && !con->thread) { ++ if (console_trylock()) { ++ con->write(con, text, text_len); ++ console_unlock(); ++ return true; ++ } ++ } ++ ++ return false; + } + + static bool have_atomic_console(void) +@@ -1856,24 +1829,24 @@ + return true; + } + +-static u64 read_console_seq(void) ++static u64 read_console_seq(struct console *con) + { + u64 seq2; + u64 seq; + +- seq = latched_seq_read_nolock(&console_seq); +- seq2 = latched_seq_read_nolock(&console_sync_seq); ++ seq = latched_seq_read_nolock(&con->printk_seq); 
++ seq2 = latched_seq_read_nolock(&con->printk_sync_seq); + if (seq2 > seq) + seq = seq2; + #ifdef CONFIG_HAVE_NMI +- seq2 = latched_seq_read_nolock(&console_sync_nmi_seq); ++ seq2 = latched_seq_read_nolock(&con->printk_sync_nmi_seq); + if (seq2 > seq) + seq = seq2; + #endif + return seq; + } + +-static void print_sync_until(struct console *con, u64 seq) ++static void print_sync_until(struct console *con, u64 seq, bool is_locked) + { + u64 printk_seq; + +@@ -1881,210 +1854,26 @@ + cpu_relax(); - console_atomic_lock(&flags); for (;;) { -- printk_seq = atomic64_read(&console_seq); -+ printk_seq = atomic64_read(&con->printk_seq); +- printk_seq = read_console_seq(); ++ printk_seq = read_console_seq(con); if (printk_seq >= seq) break; if (!print_sync(con, &printk_seq)) break; -- atomic64_set(&console_seq, printk_seq + 1); -+ atomic64_set(&con->printk_seq, printk_seq + 1); ++ ++ if (is_locked) ++ latched_seq_write(&con->printk_seq, printk_seq + 1); + #ifdef CONFIG_PRINTK_NMI +- if (in_nmi()) { +- latched_seq_write(&console_sync_nmi_seq, printk_seq + 1); +- continue; +- } ++ else if (in_nmi()) ++ latched_seq_write(&con->printk_sync_nmi_seq, printk_seq + 1); + #endif +- latched_seq_write(&console_sync_seq, printk_seq + 1); ++ else ++ latched_seq_write(&con->printk_sync_seq, printk_seq + 1); } - console_atomic_unlock(flags); + + __printk_cpu_unlock(); } --/* + /* - * Special console_lock variants that help to reduce the risk of soft-lockups. - * They allow to pass console_lock to another printk() call using a busy wait. - */ @@ -302,10 +406,20 @@ index 1f225b32fcbd..23c6e3992962 100644 - } -} - - #ifdef CONFIG_PRINTK_NMI - #define NUM_RECURSION_CTX 2 - #else -@@ -2285,39 +2092,16 @@ asmlinkage int vprintk_emit(int facility, int level, +-/* + * Recursion is tracked separately on each CPU. If NMIs are supported, an + * additional NMI context per CPU is also separately tracked. Until per-CPU + * is available, a separate "early tracking" is performed. 
+@@ -2354,7 +2143,7 @@ + + for_each_console(con) { + if (console_may_sync(con)) +- print_sync_until(con, seq + 1); ++ print_sync_until(con, seq + 1, false); + } + } + +@@ -2367,39 +2156,16 @@ const char *fmt, va_list args) { int printed_len; @@ -346,9 +460,9 @@ index 1f225b32fcbd..23c6e3992962 100644 wake_up_klogd(); return printed_len; } -@@ -2369,38 +2153,158 @@ asmlinkage __visible int printk(const char *fmt, ...) +@@ -2424,37 +2190,162 @@ } - EXPORT_SYMBOL(printk); + EXPORT_SYMBOL(_printk); -#else /* CONFIG_PRINTK */ +static int printk_kthread_func(void *data) @@ -363,12 +477,10 @@ index 1f225b32fcbd..23c6e3992962 100644 + int ret = -ENOMEM; + char *text = NULL; + char *write_text; -+ u64 printk_seq; + size_t len; + int error; + u64 seq; - --#define printk_time false ++ + if (con->flags & CON_EXTENDED) { + ext_text = kmalloc(CONSOLE_EXT_LOG_MAX, GFP_KERNEL); + if (!ext_text) @@ -378,26 +490,18 @@ index 1f225b32fcbd..23c6e3992962 100644 + dropped_text = kmalloc(64, GFP_KERNEL); + if (!text || !dropped_text) + goto out; - --#define prb_read_valid(rb, seq, r) false --#define prb_first_valid_seq(rb) 0 + if (con->flags & CON_EXTENDED) + write_text = ext_text; + else + write_text = text; - --#define kernel_sync_mode() false -+ seq = atomic64_read(&con->printk_seq); - --static u64 syslog_seq; --static atomic64_t console_seq = ATOMIC64_INIT(0); --static u64 exclusive_console_stop_seq; --static unsigned long console_dropped; ++ ++ seq = read_console_seq(con); ++ + prb_rec_init_rd(&r, &info, text, LOG_LINE_MAX + PREFIX_MAX); + + for (;;) { + error = wait_event_interruptible(log_wait, -+ prb_read_valid(prb, seq, &r) || kthread_should_stop()); ++ prb_read_valid(prb, seq, &r) || kthread_should_stop()); + + if (kthread_should_stop()) + break; @@ -417,39 +521,54 @@ index 1f225b32fcbd..23c6e3992962 100644 + + if (suppress_message_printing(r.info->level)) + continue; -+ + +-#define printk_time false + if (con->flags & CON_EXTENDED) { + len = info_print_ext_header(ext_text, -+ 
CONSOLE_EXT_LOG_MAX, -+ r.info); ++ CONSOLE_EXT_LOG_MAX, ++ r.info); + len += msg_print_ext_body(ext_text + len, -+ CONSOLE_EXT_LOG_MAX - len, -+ &r.text_buf[0], r.info->text_len, -+ &r.info->dev_info); ++ CONSOLE_EXT_LOG_MAX - len, ++ &r.text_buf[0], r.info->text_len, ++ &r.info->dev_info); + } else { + len = record_print_text(&r, -+ console_msg_format & MSG_FORMAT_SYSLOG, -+ printk_time); ++ console_msg_format & MSG_FORMAT_SYSLOG, ++ printk_time); + } -+ -+ printk_seq = atomic64_read(&con->printk_seq); --static size_t record_print_text(const struct printk_record *r, -- bool syslog, bool time) +-#define prb_read_valid(rb, seq, r) false +-#define prb_first_valid_seq(rb) 0 +-#define read_console_seq() 0 +-#define latched_seq_write(dst, src) +-#define kernel_sync_mode() false + console_lock(); ++ ++ /* ++ * Even though the printk kthread is always preemptible, it is ++ * still not allowed to call cond_resched() from within ++ * console drivers. The task may become non-preemptible in the ++ * console driver call chain. For example, vt_console_print() ++ * takes a spinlock and then can call into fbcon_redraw(), ++ * which can conditionally invoke cond_resched(). 
++ */ + console_may_schedule = 0; + + if (kernel_sync_mode() && con->write_atomic) { + console_unlock(); + break; + } -+ + +-static u64 exclusive_console_stop_seq; +-static unsigned long console_dropped; + if (!(con->flags & CON_EXTENDED) && dropped) { + dropped_len = snprintf(dropped_text, 64, + "** %lu printk messages dropped **\n", + dropped); + dropped = 0; -+ + +-static size_t record_print_text(const struct printk_record *r, +- bool syslog, bool time) + con->write(con, dropped_text, dropped_len); + printk_delay(r.info->level); + } @@ -458,10 +577,11 @@ index 1f225b32fcbd..23c6e3992962 100644 + if (len) + printk_delay(r.info->level); + -+ atomic64_cmpxchg_relaxed(&con->printk_seq, printk_seq, seq); ++ latched_seq_write(&con->printk_seq, seq); + + console_unlock(); + } ++ ret = 0; +out: + kfree(dropped_text); + kfree(text); @@ -480,8 +600,8 @@ index 1f225b32fcbd..23c6e3992962 100644 + "pr/%s%d", con->name, con->index); + if (IS_ERR(con->thread)) { + pr_err("%sconsole [%s%d]: unable to start printing thread\n", -+ (con->flags & CON_BOOT) ? "boot" : "", -+ con->name, con->index); ++ (con->flags & CON_BOOT) ? "boot" : "", ++ con->name, con->index); + return; + } + pr_info("%sconsole [%s%d]: printing thread started\n", @@ -507,8 +627,13 @@ index 1f225b32fcbd..23c6e3992962 100644 + * The printing threads have not been started yet. If this console + * can print synchronously, print all unprinted messages. 
+ */ -+ if (console_can_sync(con)) -+ print_sync_until(con, prb_next_seq(prb)); ++ if (console_may_sync(con)) { ++ unsigned long flags; ++ ++ local_irq_save(flags); ++ print_sync_until(con, prb_next_seq(prb), true); ++ local_irq_restore(flags); ++ } } -static ssize_t msg_print_ext_body(char *buf, size_t size, - char *text, size_t text_len, @@ -518,17 +643,10 @@ index 1f225b32fcbd..23c6e3992962 100644 -static void call_console_drivers(const char *ext_text, size_t ext_len, - const char *text, size_t len) {} -static bool suppress_message_printing(int level) { return false; } -+ -+#else /* CONFIG_PRINTK */ -+ -+#define prb_first_valid_seq(rb) 0 -+#define prb_next_seq(rb) 0 -+ -+#define console_try_thread(con) #endif /* CONFIG_PRINTK */ -@@ -2638,36 +2542,6 @@ int is_console_locked(void) +@@ -2711,36 +2602,6 @@ } EXPORT_SYMBOL(is_console_locked); @@ -565,12 +683,13 @@ index 1f225b32fcbd..23c6e3992962 100644 /** * console_unlock - unlock the console system * -@@ -2684,131 +2558,14 @@ static inline int can_use_console(void) +@@ -2757,139 +2618,13 @@ */ void console_unlock(void) { - static char ext_text[CONSOLE_EXT_LOG_MAX]; - static char text[CONSOLE_LOG_MAX]; +- unsigned long flags; - bool do_cond_resched, retry; - struct printk_info info; - struct printk_record r; @@ -614,16 +733,17 @@ index 1f225b32fcbd..23c6e3992962 100644 - - for (;;) { - size_t ext_len = 0; +- int handover; - size_t len; - -skip: -- seq = atomic64_read(&console_seq); +- seq = read_console_seq(); - if (!prb_read_valid(prb, seq, &r)) - break; - - if (seq != r.info->seq) { - console_dropped += r.info->seq - seq; -- atomic64_set(&console_seq, r.info->seq); +- latched_seq_write(&console_seq, r.info->seq); - seq = r.info->seq; - } - @@ -633,7 +753,7 @@ index 1f225b32fcbd..23c6e3992962 100644 - * directly to the console when we received it, and - * record that has level above the console loglevel. 
- */ -- atomic64_set(&console_seq, seq + 1); +- latched_seq_write(&console_seq, seq + 1); - goto skip; - } - @@ -660,21 +780,28 @@ index 1f225b32fcbd..23c6e3992962 100644 - len = record_print_text(&r, - console_msg_format & MSG_FORMAT_SYSLOG, - printk_time); -- atomic64_set(&console_seq, seq + 1); +- latched_seq_write(&console_seq, seq + 1); - - /* - * While actively printing out messages, if another printk() - * were to occur on another CPU, it may wait for this one to - * finish. This task can not be preempted if there is a - * waiter waiting to take over. +- * +- * Interrupts are disabled because the hand over to a waiter +- * must not be interrupted until the hand over is completed +- * (@console_waiter is cleared). - */ +- printk_safe_enter_irqsave(flags); - console_lock_spinning_enable(); - - stop_critical_timings(); /* don't trace print latency */ - call_console_drivers(ext_text, ext_len, text, len); - start_critical_timings(); - -- if (console_lock_spinning_disable_and_check()) +- handover = console_lock_spinning_disable_and_check(); +- printk_safe_exit_irqrestore(flags); +- if (handover) - return; - - if (do_cond_resched) @@ -682,7 +809,6 @@ index 1f225b32fcbd..23c6e3992962 100644 - } - console_locked = 0; - up_console_sem(); - - /* @@ -691,52 +817,54 @@ index 1f225b32fcbd..23c6e3992962 100644 - * there's a new owner and the console_unlock() from them will do the - * flush, no worries. - */ -- retry = prb_read_valid(prb, atomic64_read(&console_seq), NULL); +- retry = prb_read_valid(prb, read_console_seq(), NULL); - if (retry && console_trylock()) - goto again; } EXPORT_SYMBOL(console_unlock); -@@ -2858,18 +2615,20 @@ void console_unblank(void) +@@ -2939,19 +2674,20 @@ */ void console_flush_on_panic(enum con_flush_mode mode) { -- /* -- * If someone else is holding the console lock, trylock will fail -- * and may_schedule may be set. Ignore and proceed to unlock so -- * that messages are flushed out. 
As this can be called from any -- * context and we don't want to get preempted while flushing, -- * ensure may_schedule is cleared. -- */ -- console_trylock(); -+ struct console *c; -+ u64 seq; -+ +- if (console_trylock()) { +- if (mode == CONSOLE_REPLAY_ALL) +- latched_seq_write(&console_seq, prb_first_valid_seq(prb)); +- } else { +- /* +- * Another context is holding the console lock and +- * @console_may_schedule may be set. Ignore and proceed to +- * unlock so that messages are flushed out. As this can be +- * called from any context and we don't want to get preempted +- * while flushing, ensure @console_may_schedule is cleared. +- */ +- console_may_schedule = 0; + if (!console_trylock()) + return; + - console_may_schedule = 0; - -- if (mode == CONSOLE_REPLAY_ALL) -- atomic64_set(&console_seq, prb_first_valid_seq(prb)); ++#ifdef CONFIG_PRINTK + if (mode == CONSOLE_REPLAY_ALL) { ++ struct console *c; ++ u64 seq; ++ + seq = prb_first_valid_seq(prb); + for_each_console(c) -+ atomic64_set(&c->printk_seq, seq); -+ } ++ latched_seq_write(&c->printk_seq, seq); + } ++#endif + console_unlock(); } -@@ -3004,7 +2763,6 @@ static int try_enable_new_console(struct console *newcon, bool user_specified) - */ +@@ -3087,6 +2823,7 @@ void register_console(struct console *newcon) { -- unsigned long flags; struct console *bcon = NULL; ++ u64 __maybe_unused seq = 0; int err; -@@ -3028,6 +2786,8 @@ void register_console(struct console *newcon) + for_each_console(bcon) { +@@ -3109,6 +2846,8 @@ } } @@ -745,7 +873,7 @@ index 1f225b32fcbd..23c6e3992962 100644 if (console_drivers && console_drivers->flags & CON_BOOT) bcon = console_drivers; -@@ -3092,27 +2852,12 @@ void register_console(struct console *newcon) +@@ -3173,27 +2912,21 @@ if (newcon->flags & CON_EXTENDED) nr_ext_console_drivers++; @@ -763,22 +891,31 @@ index 1f225b32fcbd..23c6e3992962 100644 - * ignores console_lock. 
- */ - exclusive_console = newcon; -- exclusive_console_stop_seq = atomic64_read(&console_seq); -+ if (newcon->flags & CON_PRINTBUFFER) -+ atomic64_set(&newcon->printk_seq, 0); -+ else -+ atomic64_set(&newcon->printk_seq, prb_next_seq(prb)); +- exclusive_console_stop_seq = read_console_seq(); ++#ifdef CONFIG_PRINTK ++ if (!(newcon->flags & CON_PRINTBUFFER)) ++ seq = prb_next_seq(prb); - /* Get a consistent copy of @syslog_seq. */ -- spin_lock_irqsave(&syslog_lock, flags); -- atomic64_set(&console_seq, syslog_seq); -- spin_unlock_irqrestore(&syslog_lock, flags); +- mutex_lock(&syslog_lock); +- latched_seq_write(&console_seq, syslog_seq); +- mutex_unlock(&syslog_lock); - } ++ seqcount_latch_init(&newcon->printk_seq.latch); ++ latched_seq_write(&newcon->printk_seq, seq); ++ seqcount_latch_init(&newcon->printk_sync_seq.latch); ++ latched_seq_write(&newcon->printk_sync_seq, seq); ++#ifdef CONFIG_HAVE_NMI ++ seqcount_latch_init(&newcon->printk_sync_nmi_seq.latch); ++ latched_seq_write(&newcon->printk_sync_nmi_seq, seq); ++#endif ++ + console_try_thread(newcon); ++#endif /* CONFIG_PRINTK */ console_unlock(); console_sysfs_notify(); -@@ -3186,6 +2931,9 @@ int unregister_console(struct console *console) +@@ -3267,6 +3000,9 @@ console_unlock(); console_sysfs_notify(); @@ -788,7 +925,7 @@ index 1f225b32fcbd..23c6e3992962 100644 if (console->exit) res = console->exit(console); -@@ -3268,6 +3016,15 @@ static int __init printk_late_init(void) +@@ -3349,6 +3085,15 @@ unregister_console(con); } } @@ -804,7 +941,7 @@ index 1f225b32fcbd..23c6e3992962 100644 ret = cpuhp_setup_state_nocalls(CPUHP_PRINTK_DEAD, "printk:dead", NULL, console_cpu_notify); WARN_ON(ret < 0); -@@ -3283,7 +3040,6 @@ late_initcall(printk_late_init); +@@ -3364,7 +3109,6 @@ * Delayed printk version, for scheduler-internal messages: */ #define PRINTK_PENDING_WAKEUP 0x01 @@ -812,7 +949,7 @@ index 1f225b32fcbd..23c6e3992962 100644 static DEFINE_PER_CPU(int, printk_pending); -@@ -3291,14 +3047,8 @@ static void 
wake_up_klogd_work_func(struct irq_work *irq_work) +@@ -3372,14 +3116,8 @@ { int pending = __this_cpu_xchg(printk_pending, 0); @@ -827,8 +964,8 @@ index 1f225b32fcbd..23c6e3992962 100644 + wake_up_interruptible_all(&log_wait); } - static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = { -@@ -3321,13 +3071,6 @@ void wake_up_klogd(void) + static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = +@@ -3400,13 +3138,6 @@ void defer_console_output(void) { @@ -841,7 +978,4 @@ index 1f225b32fcbd..23c6e3992962 100644 - preempt_enable(); } - int vprintk_deferred(const char *fmt, va_list args) --- -2.30.2 - + void printk_trigger_flush(void) diff --git a/debian/patches-rt/printk__relocate_printk_delay.patch b/debian/patches-rt/printk__relocate_printk_delay.patch new file mode 100644 index 000000000..b2209d31b --- /dev/null +++ b/debian/patches-rt/printk__relocate_printk_delay.patch @@ -0,0 +1,62 @@ +Subject: printk: relocate printk_delay() +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:03 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: John Ogness <john.ogness@linutronix.de> + +Move printk_delay() "as is" further up so that they can be used by +new functions in an upcoming commit. 
+ +Signed-off-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + +--- + kernel/printk/printk.c | 28 ++++++++++++++-------------- + 1 file changed, 14 insertions(+), 14 deletions(-) +--- +--- a/kernel/printk/printk.c ++++ b/kernel/printk/printk.c +@@ -1748,6 +1748,20 @@ SYSCALL_DEFINE3(syslog, int, type, char + return do_syslog(type, buf, len, SYSLOG_FROM_READER); + } + ++int printk_delay_msec __read_mostly; ++ ++static inline void printk_delay(void) ++{ ++ if (unlikely(printk_delay_msec)) { ++ int m = printk_delay_msec; ++ ++ while (m--) { ++ mdelay(1); ++ touch_nmi_watchdog(); ++ } ++ } ++} ++ + /* + * Special console_lock variants that help to reduce the risk of soft-lockups. + * They allow to pass console_lock to another printk() call using a busy wait. +@@ -2002,20 +2016,6 @@ static u8 *__printk_recursion_counter(vo + local_irq_restore(flags); \ + } while (0) + +-int printk_delay_msec __read_mostly; +- +-static inline void printk_delay(void) +-{ +- if (unlikely(printk_delay_msec)) { +- int m = printk_delay_msec; +- +- while (m--) { +- mdelay(1); +- touch_nmi_watchdog(); +- } +- } +-} +- + static inline u32 printk_caller_id(void) + { + return in_task() ? 
task_pid_nr(current) : diff --git a/debian/patches-rt/printk__remove_deferred_printing.patch b/debian/patches-rt/printk__remove_deferred_printing.patch new file mode 100644 index 000000000..f695693d2 --- /dev/null +++ b/debian/patches-rt/printk__remove_deferred_printing.patch @@ -0,0 +1,826 @@ +Subject: printk: remove deferred printing +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:08 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: John Ogness <john.ogness@linutronix.de> + +Since printing occurs either atomically or from the printing +kthread, there is no need for any deferring or tracking possible +recursion paths. Remove all printk defer functions and context +tracking. + +Signed-off-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + +--- + arch/arm/kernel/smp.c | 2 - + arch/powerpc/kexec/crash.c | 3 - + arch/x86/kernel/dumpstack_32.c | 2 - + arch/x86/kernel/dumpstack_64.c | 3 + + arch/x86/kernel/i8259.c | 3 - + arch/x86/kernel/unwind_frame.c | 16 +++----- + arch/x86/kernel/unwind_orc.c | 2 - + drivers/char/random.c | 5 +- + include/linux/printk.h | 34 ------------------ + include/linux/suspend.h | 10 +---- + kernel/power/main.c | 10 +---- + kernel/printk/Makefile | 1 + kernel/printk/internal.h | 36 ------------------- + kernel/printk/printk.c | 74 +++++++++++++--------------------------- + kernel/printk/printk_safe.c | 52 ---------------------------- + kernel/sched/core.c | 9 ++-- + kernel/sched/deadline.c | 2 - + kernel/sched/fair.c | 5 -- + kernel/sched/psi.c | 14 +++---- + kernel/sched/rt.c | 2 - + kernel/time/clockevents.c | 9 +--- + kernel/time/ntp.c | 14 ++----- + kernel/time/timekeeping.c | 30 ++++++++-------- + kernel/time/timekeeping_debug.c | 2 - + kernel/workqueue.c | 4 -- + lib/ratelimit.c | 4 -- + 26 files changed, 83 insertions(+), 265 deletions(-) + delete mode 100644 kernel/printk/internal.h + delete mode 
100644 kernel/printk/printk_safe.c +--- +--- a/arch/arm/kernel/smp.c ++++ b/arch/arm/kernel/smp.c +@@ -667,9 +667,7 @@ + break; + + case IPI_CPU_BACKTRACE: +- printk_deferred_enter(); + nmi_cpu_backtrace(get_irq_regs()); +- printk_deferred_exit(); + break; + + default: +--- a/arch/powerpc/kexec/crash.c ++++ b/arch/powerpc/kexec/crash.c +@@ -312,9 +312,6 @@ + unsigned int i; + int (*old_handler)(struct pt_regs *regs); + +- /* Avoid hardlocking with irresponsive CPU holding logbuf_lock */ +- printk_deferred_enter(); +- + /* + * This function is only called after the system + * has panicked or is otherwise in a critical state. +--- a/arch/x86/kernel/dumpstack_32.c ++++ b/arch/x86/kernel/dumpstack_32.c +@@ -141,7 +141,7 @@ + */ + if (visit_mask) { + if (*visit_mask & (1UL << info->type)) { +- printk_deferred_once(KERN_WARNING "WARNING: stack recursion on stack type %d\n", info->type); ++ pr_warn_once("WARNING: stack recursion on stack type %d\n", info->type); + goto unknown; + } + *visit_mask |= 1UL << info->type; +--- a/arch/x86/kernel/dumpstack_64.c ++++ b/arch/x86/kernel/dumpstack_64.c +@@ -207,7 +207,8 @@ + if (visit_mask) { + if (*visit_mask & (1UL << info->type)) { + if (task == current) +- printk_deferred_once(KERN_WARNING "WARNING: stack recursion on stack type %d\n", info->type); ++ pr_warn_once("WARNING: stack recursion on stack type %d\n", ++ info->type); + goto unknown; + } + *visit_mask |= 1UL << info->type; +--- a/arch/x86/kernel/i8259.c ++++ b/arch/x86/kernel/i8259.c +@@ -207,8 +207,7 @@ + * lets ACK and report it. 
[once per IRQ] + */ + if (!(spurious_irq_mask & irqmask)) { +- printk_deferred(KERN_DEBUG +- "spurious 8259A interrupt: IRQ%d.\n", irq); ++ printk(KERN_DEBUG "spurious 8259A interrupt: IRQ%d.\n", irq); + spurious_irq_mask |= irqmask; + } + atomic_inc(&irq_err_count); +--- a/arch/x86/kernel/unwind_frame.c ++++ b/arch/x86/kernel/unwind_frame.c +@@ -41,9 +41,9 @@ + + dumped_before = true; + +- printk_deferred("unwind stack type:%d next_sp:%p mask:0x%lx graph_idx:%d\n", +- state->stack_info.type, state->stack_info.next_sp, +- state->stack_mask, state->graph_idx); ++ printk("unwind stack type:%d next_sp:%p mask:0x%lx graph_idx:%d\n", ++ state->stack_info.type, state->stack_info.next_sp, ++ state->stack_mask, state->graph_idx); + + for (sp = PTR_ALIGN(state->orig_sp, sizeof(long)); sp; + sp = PTR_ALIGN(stack_info.next_sp, sizeof(long))) { +@@ -59,13 +59,11 @@ + + if (zero) { + if (!prev_zero) +- printk_deferred("%p: %0*x ...\n", +- sp, BITS_PER_LONG/4, 0); ++ printk("%p: %0*x ...\n", sp, BITS_PER_LONG/4, 0); + continue; + } + +- printk_deferred("%p: %0*lx (%pB)\n", +- sp, BITS_PER_LONG/4, word, (void *)word); ++ printk("%p: %0*lx (%pB)\n", sp, BITS_PER_LONG/4, word, (void *)word); + } + } + } +@@ -342,13 +340,13 @@ + goto the_end; + + if (state->regs) { +- printk_deferred_once(KERN_WARNING ++ pr_warn_once( + "WARNING: kernel stack regs at %p in %s:%d has bad 'bp' value %p\n", + state->regs, state->task->comm, + state->task->pid, next_bp); + unwind_dump(state); + } else { +- printk_deferred_once(KERN_WARNING ++ pr_warn_once( + "WARNING: kernel stack frame pointer at %p in %s:%d has bad value %p\n", + state->bp, state->task->comm, + state->task->pid, next_bp); +--- a/arch/x86/kernel/unwind_orc.c ++++ b/arch/x86/kernel/unwind_orc.c +@@ -9,7 +9,7 @@ + #include <asm/orc_lookup.h> + + #define orc_warn(fmt, ...) \ +- printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__) ++ pr_warn_once("WARNING: " fmt, ##__VA_ARGS__) + + #define orc_warn_current(args...) 
\ + ({ \ +--- a/drivers/char/random.c ++++ b/drivers/char/random.c +@@ -1507,9 +1507,8 @@ + print_once = true; + #endif + if (__ratelimit(&unseeded_warning)) +- printk_deferred(KERN_NOTICE "random: %s called from %pS " +- "with crng_init=%d\n", func_name, caller, +- crng_init); ++ pr_notice("random: %s called from %pS with crng_init=%d\n", ++ func_name, caller, crng_init); + } + + /* +--- a/include/linux/printk.h ++++ b/include/linux/printk.h +@@ -162,21 +162,6 @@ + int _printk(const char *fmt, ...); + + /* +- * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ ! +- */ +-__printf(1, 2) __cold int _printk_deferred(const char *fmt, ...); +- +-extern void __printk_safe_enter(void); +-extern void __printk_safe_exit(void); +-/* +- * The printk_deferred_enter/exit macros are available only as a hack for +- * some code paths that need to defer all printk console printing. Interrupts +- * must be disabled for the deferred duration. +- */ +-#define printk_deferred_enter __printk_safe_enter +-#define printk_deferred_exit __printk_safe_exit +- +-/* + * Please don't use printk_ratelimit(), because it shares ratelimiting state + * with all other unrelated printk_ratelimit() callsites. Instead use + * printk_ratelimited() or plain old __ratelimit(). +@@ -216,19 +201,6 @@ + { + return 0; + } +-static inline __printf(1, 2) __cold +-int _printk_deferred(const char *s, ...) +-{ +- return 0; +-} +- +-static inline void printk_deferred_enter(void) +-{ +-} +- +-static inline void printk_deferred_exit(void) +-{ +-} + + static inline int printk_ratelimit(void) + { +@@ -475,8 +447,6 @@ + * See the vsnprintf() documentation for format string extensions over C99. + */ + #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__) +-#define printk_deferred(fmt, ...) \ +- printk_index_wrap(_printk_deferred, fmt, ##__VA_ARGS__) + + /** + * pr_emerg - Print an emergency-level message +@@ -614,13 +584,9 @@ + #ifdef CONFIG_PRINTK + #define printk_once(fmt, ...) 
\ + DO_ONCE_LITE(printk, fmt, ##__VA_ARGS__) +-#define printk_deferred_once(fmt, ...) \ +- DO_ONCE_LITE(printk_deferred, fmt, ##__VA_ARGS__) + #else + #define printk_once(fmt, ...) \ + no_printk(fmt, ##__VA_ARGS__) +-#define printk_deferred_once(fmt, ...) \ +- no_printk(fmt, ##__VA_ARGS__) + #endif + + #define pr_emerg_once(fmt, ...) \ +--- a/include/linux/suspend.h ++++ b/include/linux/suspend.h +@@ -550,23 +550,17 @@ + #ifdef CONFIG_PM_SLEEP_DEBUG + extern bool pm_print_times_enabled; + extern bool pm_debug_messages_on; +-extern __printf(2, 3) void __pm_pr_dbg(bool defer, const char *fmt, ...); ++extern __printf(1, 2) void pm_pr_dbg(const char *fmt, ...); + #else + #define pm_print_times_enabled (false) + #define pm_debug_messages_on (false) + + #include <linux/printk.h> + +-#define __pm_pr_dbg(defer, fmt, ...) \ ++#define pm_pr_dbg(fmt, ...) \ + no_printk(KERN_DEBUG fmt, ##__VA_ARGS__) + #endif + +-#define pm_pr_dbg(fmt, ...) \ +- __pm_pr_dbg(false, fmt, ##__VA_ARGS__) +- +-#define pm_deferred_pr_dbg(fmt, ...) \ +- __pm_pr_dbg(true, fmt, ##__VA_ARGS__) +- + #ifdef CONFIG_PM_AUTOSLEEP + + /* kernel/power/autosleep.c */ +--- a/kernel/power/main.c ++++ b/kernel/power/main.c +@@ -543,14 +543,13 @@ + __setup("pm_debug_messages", pm_debug_messages_setup); + + /** +- * __pm_pr_dbg - Print a suspend debug message to the kernel log. +- * @defer: Whether or not to use printk_deferred() to print the message. ++ * pm_pr_dbg - Print a suspend debug message to the kernel log. + * @fmt: Message format. + * + * The message will be emitted if enabled through the pm_debug_messages + * sysfs attribute. + */ +-void __pm_pr_dbg(bool defer, const char *fmt, ...) ++void pm_pr_dbg(const char *fmt, ...) 
+ { + struct va_format vaf; + va_list args; +@@ -563,10 +562,7 @@ + vaf.fmt = fmt; + vaf.va = &args; + +- if (defer) +- printk_deferred(KERN_DEBUG "PM: %pV", &vaf); +- else +- printk(KERN_DEBUG "PM: %pV", &vaf); ++ printk(KERN_DEBUG "PM: %pV", &vaf); + + va_end(args); + } +--- a/kernel/printk/Makefile ++++ b/kernel/printk/Makefile +@@ -1,6 +1,5 @@ + # SPDX-License-Identifier: GPL-2.0-only + obj-y = printk.o +-obj-$(CONFIG_PRINTK) += printk_safe.o + obj-$(CONFIG_A11Y_BRAILLE_CONSOLE) += braille.o + obj-$(CONFIG_PRINTK) += printk_ringbuffer.o + obj-$(CONFIG_PRINTK_INDEX) += index.o +--- a/kernel/printk/internal.h ++++ b/kernel/printk/internal.h +@@ -2,7 +2,6 @@ + /* + * internal.h - printk internal definitions + */ +-#include <linux/percpu.h> + + #ifdef CONFIG_PRINTK + +@@ -12,41 +11,6 @@ + LOG_CONT = 8, /* text is a fragment of a continuation line */ + }; + +-__printf(4, 0) +-int vprintk_store(int facility, int level, +- const struct dev_printk_info *dev_info, +- const char *fmt, va_list args); +- +-__printf(1, 0) int vprintk_default(const char *fmt, va_list args); +-__printf(1, 0) int vprintk_deferred(const char *fmt, va_list args); +- +-bool printk_percpu_data_ready(void); +- +-#define printk_safe_enter_irqsave(flags) \ +- do { \ +- local_irq_save(flags); \ +- __printk_safe_enter(); \ +- } while (0) +- +-#define printk_safe_exit_irqrestore(flags) \ +- do { \ +- __printk_safe_exit(); \ +- local_irq_restore(flags); \ +- } while (0) +- +-void defer_console_output(void); +- + u16 printk_parse_prefix(const char *text, int *level, + enum printk_info_flags *flags); +-#else +- +-/* +- * In !PRINTK builds we still export console_sem +- * semaphore and some of console functions (console_unlock()/etc.), so +- * printk-safe must preserve the existing local IRQ guarantees. 
+- */ +-#define printk_safe_enter_irqsave(flags) local_irq_save(flags) +-#define printk_safe_exit_irqrestore(flags) local_irq_restore(flags) +- +-static inline bool printk_percpu_data_ready(void) { return false; } + #endif /* CONFIG_PRINTK */ +--- a/kernel/printk/printk.c ++++ b/kernel/printk/printk.c +@@ -44,6 +44,7 @@ + #include <linux/irq_work.h> + #include <linux/ctype.h> + #include <linux/uio.h> ++#include <linux/kdb.h> + #include <linux/kgdb.h> + #include <linux/kthread.h> + #include <linux/clocksource.h> +@@ -228,19 +229,7 @@ + + static int __down_trylock_console_sem(unsigned long ip) + { +- int lock_failed; +- unsigned long flags; +- +- /* +- * Here and in __up_console_sem() we need to be in safe mode, +- * because spindump/WARN/etc from under console ->lock will +- * deadlock in printk()->down_trylock_console_sem() otherwise. +- */ +- printk_safe_enter_irqsave(flags); +- lock_failed = down_trylock(&console_sem); +- printk_safe_exit_irqrestore(flags); +- +- if (lock_failed) ++ if (down_trylock(&console_sem)) + return 1; + mutex_acquire(&console_lock_dep_map, 0, 1, ip); + return 0; +@@ -249,13 +238,9 @@ + + static void __up_console_sem(unsigned long ip) + { +- unsigned long flags; +- + mutex_release(&console_lock_dep_map, ip); + +- printk_safe_enter_irqsave(flags); + up(&console_sem); +- printk_safe_exit_irqrestore(flags); + } + #define up_console_sem() __up_console_sem(_RET_IP_) + +@@ -417,7 +402,7 @@ + */ + static bool __printk_percpu_data_ready __read_mostly; + +-bool printk_percpu_data_ready(void) ++static bool printk_percpu_data_ready(void) + { + return __printk_percpu_data_ready; + } +@@ -2023,9 +2008,9 @@ + } + + __printf(4, 0) +-int vprintk_store(int facility, int level, +- const struct dev_printk_info *dev_info, +- const char *fmt, va_list args) ++static int vprintk_store(int facility, int level, ++ const struct dev_printk_info *dev_info, ++ const char *fmt, va_list args) + { + const u32 caller_id = printk_caller_id(); + struct prb_reserved_entry e; 
+@@ -2171,11 +2156,28 @@ + } + EXPORT_SYMBOL(vprintk_emit); + +-int vprintk_default(const char *fmt, va_list args) ++__printf(1, 0) ++static int vprintk_default(const char *fmt, va_list args) + { + return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, fmt, args); + } +-EXPORT_SYMBOL_GPL(vprintk_default); ++ ++__printf(1, 0) ++static int vprintk_func(const char *fmt, va_list args) ++{ ++#ifdef CONFIG_KGDB_KDB ++ /* Allow to pass printk() to kdb but avoid a recursion. */ ++ if (unlikely(kdb_trap_printk && kdb_printf_cpu < 0)) ++ return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args); ++#endif ++ return vprintk_default(fmt, args); ++} ++ ++asmlinkage int vprintk(const char *fmt, va_list args) ++{ ++ return vprintk_func(fmt, args); ++} ++EXPORT_SYMBOL(vprintk); + + asmlinkage __visible int _printk(const char *fmt, ...) + { +@@ -3136,35 +3138,9 @@ + preempt_enable(); + } + +-void defer_console_output(void) +-{ +-} +- + void printk_trigger_flush(void) + { +- defer_console_output(); +-} +- +-int vprintk_deferred(const char *fmt, va_list args) +-{ +- int r; +- +- r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); +- defer_console_output(); +- +- return r; +-} +- +-int _printk_deferred(const char *fmt, ...) +-{ +- va_list args; +- int r; +- +- va_start(args, fmt); +- r = vprintk_deferred(fmt, args); +- va_end(args); +- +- return r; ++ wake_up_klogd(); + } + + /* +--- a/kernel/printk/printk_safe.c ++++ /dev/null +@@ -1,52 +0,0 @@ +-// SPDX-License-Identifier: GPL-2.0-or-later +-/* +- * printk_safe.c - Safe printk for printk-deadlock-prone contexts +- */ +- +-#include <linux/preempt.h> +-#include <linux/kdb.h> +-#include <linux/smp.h> +-#include <linux/cpumask.h> +-#include <linux/printk.h> +-#include <linux/kprobes.h> +- +-#include "internal.h" +- +-static DEFINE_PER_CPU(int, printk_context); +- +-/* Can be preempted by NMI. */ +-void __printk_safe_enter(void) +-{ +- this_cpu_inc(printk_context); +-} +- +-/* Can be preempted by NMI. 
*/ +-void __printk_safe_exit(void) +-{ +- this_cpu_dec(printk_context); +-} +- +-asmlinkage int vprintk(const char *fmt, va_list args) +-{ +-#ifdef CONFIG_KGDB_KDB +- /* Allow to pass printk() to kdb but avoid a recursion. */ +- if (unlikely(kdb_trap_printk && kdb_printf_cpu < 0)) +- return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args); +-#endif +- +- /* +- * Use the main logbuf even in NMI. But avoid calling console +- * drivers that might have their own locks. +- */ +- if (this_cpu_read(printk_context) || in_nmi()) { +- int len; +- +- len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args); +- defer_console_output(); +- return len; +- } +- +- /* No obstacles. */ +- return vprintk_default(fmt, args); +-} +-EXPORT_SYMBOL(vprintk); +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -2944,9 +2944,8 @@ + + out_set_mask: + if (printk_ratelimit()) { +- printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n", +- task_pid_nr(p), p->comm, +- cpumask_pr_args(override_mask)); ++ printk("Overriding affinity for process %d (%s) to CPUs %*pbl\n", ++ task_pid_nr(p), p->comm, cpumask_pr_args(override_mask)); + } + + WARN_ON(set_cpus_allowed_ptr(p, override_mask)); +@@ -3376,8 +3375,8 @@ + * leave kernel. + */ + if (p->mm && printk_ratelimit()) { +- printk_deferred("process %d (%s) no longer affine to cpu%d\n", +- task_pid_nr(p), p->comm, cpu); ++ printk("process %d (%s) no longer affine to cpu%d\n", ++ task_pid_nr(p), p->comm, cpu); + } + } + +--- a/kernel/sched/deadline.c ++++ b/kernel/sched/deadline.c +@@ -800,7 +800,7 @@ + * entity. 
+ */ + if (dl_time_before(dl_se->deadline, rq_clock(rq))) { +- printk_deferred_once("sched: DL replenish lagged too much\n"); ++ printk_once("sched: DL replenish lagged too much\n"); + dl_se->deadline = rq_clock(rq) + pi_of(dl_se)->dl_deadline; + dl_se->runtime = pi_of(dl_se)->dl_runtime; + } +--- a/kernel/sched/fair.c ++++ b/kernel/sched/fair.c +@@ -4237,10 +4237,7 @@ + trace_sched_stat_iowait_enabled() || + trace_sched_stat_blocked_enabled() || + trace_sched_stat_runtime_enabled()) { +- printk_deferred_once("Scheduler tracepoints stat_sleep, stat_iowait, " +- "stat_blocked and stat_runtime require the " +- "kernel parameter schedstats=enable or " +- "kernel.sched_schedstats=1\n"); ++ printk_once("Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1\n"); + } + #endif + } +--- a/kernel/sched/psi.c ++++ b/kernel/sched/psi.c +@@ -710,10 +710,10 @@ + if (groupc->tasks[t]) { + groupc->tasks[t]--; + } else if (!psi_bug) { +- printk_deferred(KERN_ERR "psi: task underflow! cpu=%d t=%d tasks=[%u %u %u %u] clear=%x set=%x\n", +- cpu, t, groupc->tasks[0], +- groupc->tasks[1], groupc->tasks[2], +- groupc->tasks[3], clear, set); ++ pr_err("psi: task underflow! cpu=%d t=%d tasks=[%u %u %u %u] clear=%x set=%x\n", ++ cpu, t, groupc->tasks[0], ++ groupc->tasks[1], groupc->tasks[2], ++ groupc->tasks[3], clear, set); + psi_bug = 1; + } + } +@@ -779,9 +779,9 @@ + if (((task->psi_flags & set) || + (task->psi_flags & clear) != clear) && + !psi_bug) { +- printk_deferred(KERN_ERR "psi: inconsistent task state! task=%d:%s cpu=%d psi_flags=%x clear=%x set=%x\n", +- task->pid, task->comm, task_cpu(task), +- task->psi_flags, clear, set); ++ pr_err("psi: inconsistent task state! 
task=%d:%s cpu=%d psi_flags=%x clear=%x set=%x\n", ++ task->pid, task->comm, task_cpu(task), ++ task->psi_flags, clear, set); + psi_bug = 1; + } + +--- a/kernel/sched/rt.c ++++ b/kernel/sched/rt.c +@@ -977,7 +977,7 @@ + */ + if (likely(rt_b->rt_runtime)) { + rt_rq->rt_throttled = 1; +- printk_deferred_once("sched: RT throttling activated\n"); ++ printk_once("sched: RT throttling activated\n"); + } else { + /* + * In case we did anyway, make it go away, +--- a/kernel/time/clockevents.c ++++ b/kernel/time/clockevents.c +@@ -203,8 +203,7 @@ + { + /* Nothing to do if we already reached the limit */ + if (dev->min_delta_ns >= MIN_DELTA_LIMIT) { +- printk_deferred(KERN_WARNING +- "CE: Reprogramming failure. Giving up\n"); ++ pr_warn("CE: Reprogramming failure. Giving up\n"); + dev->next_event = KTIME_MAX; + return -ETIME; + } +@@ -217,10 +216,8 @@ + if (dev->min_delta_ns > MIN_DELTA_LIMIT) + dev->min_delta_ns = MIN_DELTA_LIMIT; + +- printk_deferred(KERN_WARNING +- "CE: %s increased min_delta_ns to %llu nsec\n", +- dev->name ? dev->name : "?", +- (unsigned long long) dev->min_delta_ns); ++ pr_warn("CE: %s increased min_delta_ns to %llu nsec\n", ++ dev->name ? 
dev->name : "?", (unsigned long long) dev->min_delta_ns); + return 0; + } + +--- a/kernel/time/ntp.c ++++ b/kernel/time/ntp.c +@@ -939,9 +939,7 @@ + time_status |= STA_PPSERROR; + pps_errcnt++; + pps_dec_freq_interval(); +- printk_deferred(KERN_ERR +- "hardpps: PPSERROR: interval too long - %lld s\n", +- freq_norm.sec); ++ pr_err("hardpps: PPSERROR: interval too long - %lld s\n", freq_norm.sec); + return 0; + } + +@@ -954,8 +952,7 @@ + delta = shift_right(ftemp - pps_freq, NTP_SCALE_SHIFT); + pps_freq = ftemp; + if (delta > PPS_MAXWANDER || delta < -PPS_MAXWANDER) { +- printk_deferred(KERN_WARNING +- "hardpps: PPSWANDER: change=%ld\n", delta); ++ pr_warn("hardpps: PPSWANDER: change=%ld\n", delta); + time_status |= STA_PPSWANDER; + pps_stbcnt++; + pps_dec_freq_interval(); +@@ -999,9 +996,8 @@ + * the time offset is updated. + */ + if (jitter > (pps_jitter << PPS_POPCORN)) { +- printk_deferred(KERN_WARNING +- "hardpps: PPSJITTER: jitter=%ld, limit=%ld\n", +- jitter, (pps_jitter << PPS_POPCORN)); ++ pr_warn("hardpps: PPSJITTER: jitter=%ld, limit=%ld\n", ++ jitter, (pps_jitter << PPS_POPCORN)); + time_status |= STA_PPSJITTER; + pps_jitcnt++; + } else if (time_status & STA_PPSTIME) { +@@ -1058,7 +1054,7 @@ + time_status |= STA_PPSJITTER; + /* restart the frequency calibration interval */ + pps_fbase = *raw_ts; +- printk_deferred(KERN_ERR "hardpps: PPSJITTER: bad pulse\n"); ++ pr_err("hardpps: PPSJITTER: bad pulse\n"); + return; + } + +--- a/kernel/time/timekeeping.c ++++ b/kernel/time/timekeeping.c +@@ -203,22 +203,23 @@ + const char *name = tk->tkr_mono.clock->name; + + if (offset > max_cycles) { +- printk_deferred("WARNING: timekeeping: Cycle offset (%lld) is larger than allowed by the '%s' clock's max_cycles value (%lld): time overflow danger\n", +- offset, name, max_cycles); +- printk_deferred(" timekeeping: Your kernel is sick, but tries to cope by capping time updates\n"); ++ printk("WARNING: timekeeping: Cycle offset (%lld) is larger than allowed by the '%s' 
clock's max_cycles value (%lld): time overflow danger\n", ++ offset, name, max_cycles); ++ printk(" timekeeping: Your kernel is sick, but tries to cope by capping time updates\n"); + } else { + if (offset > (max_cycles >> 1)) { +- printk_deferred("INFO: timekeeping: Cycle offset (%lld) is larger than the '%s' clock's 50%% safety margin (%lld)\n", +- offset, name, max_cycles >> 1); +- printk_deferred(" timekeeping: Your kernel is still fine, but is feeling a bit nervous\n"); ++ printk("INFO: timekeeping: Cycle offset (%lld) is larger than the '%s' clock's 50%% safety margin (%lld)\n", ++ offset, name, max_cycles >> 1); ++ printk(" timekeeping: Your kernel is still fine, but is feeling a bit nervous\n"); + } + } + + if (tk->underflow_seen) { + if (jiffies - tk->last_warning > WARNING_FREQ) { +- printk_deferred("WARNING: Underflow in clocksource '%s' observed, time update ignored.\n", name); +- printk_deferred(" Please report this, consider using a different clocksource, if possible.\n"); +- printk_deferred(" Your kernel is probably still fine.\n"); ++ printk("WARNING: Underflow in clocksource '%s' observed, time update ignored.\n", ++ name); ++ printk(" Please report this, consider using a different clocksource, if possible.\n"); ++ printk(" Your kernel is probably still fine.\n"); + tk->last_warning = jiffies; + } + tk->underflow_seen = 0; +@@ -226,9 +227,10 @@ + + if (tk->overflow_seen) { + if (jiffies - tk->last_warning > WARNING_FREQ) { +- printk_deferred("WARNING: Overflow in clocksource '%s' observed, time update capped.\n", name); +- printk_deferred(" Please report this, consider using a different clocksource, if possible.\n"); +- printk_deferred(" Your kernel is probably still fine.\n"); ++ printk("WARNING: Overflow in clocksource '%s' observed, time update capped.\n", ++ name); ++ printk(" Please report this, consider using a different clocksource, if possible.\n"); ++ printk(" Your kernel is probably still fine.\n"); + tk->last_warning = jiffies; + } + 
tk->overflow_seen = 0; +@@ -1669,9 +1671,7 @@ + const struct timespec64 *delta) + { + if (!timespec64_valid_strict(delta)) { +- printk_deferred(KERN_WARNING +- "__timekeeping_inject_sleeptime: Invalid " +- "sleep delta value!\n"); ++ pr_warn("%s: Invalid sleep delta value!\n", __func__); + return; + } + tk_xtime_add(tk, delta); +--- a/kernel/time/timekeeping_debug.c ++++ b/kernel/time/timekeeping_debug.c +@@ -49,7 +49,7 @@ + int bin = min(fls(t->tv_sec), NUM_BINS-1); + + sleep_time_bin[bin]++; +- pm_deferred_pr_dbg("Timekeeping suspended for %lld.%03lu seconds\n", ++ pm_pr_dbg("Timekeeping suspended for %lld.%03lu seconds\n", + (s64)t->tv_sec, t->tv_nsec / NSEC_PER_MSEC); + } + +--- a/kernel/workqueue.c ++++ b/kernel/workqueue.c +@@ -4836,9 +4836,7 @@ + * drivers that queue work while holding locks + * also taken in their write paths. + */ +- printk_deferred_enter(); + show_pwq(pwq); +- printk_deferred_exit(); + } + raw_spin_unlock_irqrestore(&pwq->pool->lock, flags); + /* +@@ -4862,7 +4860,6 @@ + * queue work while holding locks also taken in their write + * paths. 
+ */ +- printk_deferred_enter(); + pr_info("pool %d:", pool->id); + pr_cont_pool_info(pool); + pr_cont(" hung=%us workers=%d", +@@ -4877,7 +4874,6 @@ + first = false; + } + pr_cont("\n"); +- printk_deferred_exit(); + next_pool: + raw_spin_unlock_irqrestore(&pool->lock, flags); + /* +--- a/lib/ratelimit.c ++++ b/lib/ratelimit.c +@@ -47,9 +47,7 @@ + if (time_is_before_jiffies(rs->begin + rs->interval)) { + if (rs->missed) { + if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) { +- printk_deferred(KERN_WARNING +- "%s: %d callbacks suppressed\n", +- func, rs->missed); ++ pr_warn("%s: %d callbacks suppressed\n", func, rs->missed); + rs->missed = 0; + } + } diff --git a/debian/patches-rt/printk__rename_printk_cpulock_API_and_always_disable_interrupts.patch b/debian/patches-rt/printk__rename_printk_cpulock_API_and_always_disable_interrupts.patch new file mode 100644 index 000000000..f59e5eefb --- /dev/null +++ b/debian/patches-rt/printk__rename_printk_cpulock_API_and_always_disable_interrupts.patch @@ -0,0 +1,118 @@ +Subject: printk: rename printk cpulock API and always disable interrupts +From: John Ogness <john.ogness@linutronix.de> +Date: Thu Jul 15 09:34:45 2021 +0206 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: John Ogness <john.ogness@linutronix.de> + +The printk cpulock functions use local_irq_disable(). This means that +hardware interrupts are also disabled on PREEMPT_RT. To make this +clear, rename the functions to use the raw_ prefix: + +raw_printk_cpu_lock_irqsave(flags); +raw_printk_cpu_unlock_irqrestore(flags); + +Also, these functions were a NOP for !CONFIG_SMP. But for !CONFIG_SMP +they still need to disable hardware interrupts. So modify them +appropriately for this. 
+ +Signed-off-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + +--- + include/linux/printk.h | 30 ++++++++++++++---------------- + lib/dump_stack.c | 4 ++-- + lib/nmi_backtrace.c | 4 ++-- + 3 files changed, 18 insertions(+), 20 deletions(-) +--- +--- a/include/linux/printk.h ++++ b/include/linux/printk.h +@@ -280,17 +280,22 @@ static inline void dump_stack(void) + extern int __printk_cpu_trylock(void); + extern void __printk_wait_on_cpu_lock(void); + extern void __printk_cpu_unlock(void); ++#else ++#define __printk_cpu_trylock() 1 ++#define __printk_wait_on_cpu_lock() ++#define __printk_cpu_unlock() ++#endif /* CONFIG_SMP */ + + /** +- * printk_cpu_lock_irqsave() - Acquire the printk cpu-reentrant spinning +- * lock and disable interrupts. ++ * raw_printk_cpu_lock_irqsave() - Acquire the printk cpu-reentrant spinning ++ * lock and disable interrupts. + * @flags: Stack-allocated storage for saving local interrupt state, +- * to be passed to printk_cpu_unlock_irqrestore(). ++ * to be passed to raw_printk_cpu_unlock_irqrestore(). + * + * If the lock is owned by another CPU, spin until it becomes available. + * Interrupts are restored while spinning. + */ +-#define printk_cpu_lock_irqsave(flags) \ ++#define raw_printk_cpu_lock_irqsave(flags) \ + for (;;) { \ + local_irq_save(flags); \ + if (__printk_cpu_trylock()) \ +@@ -300,22 +305,15 @@ extern void __printk_cpu_unlock(void); + } + + /** +- * printk_cpu_unlock_irqrestore() - Release the printk cpu-reentrant spinning +- * lock and restore interrupts. +- * @flags: Caller's saved interrupt state, from printk_cpu_lock_irqsave(). ++ * raw_printk_cpu_unlock_irqrestore() - Release the printk cpu-reentrant ++ * spinning lock and restore interrupts. ++ * @flags: Caller's saved interrupt state from raw_printk_cpu_lock_irqsave(). 
+ */ +-#define printk_cpu_unlock_irqrestore(flags) \ ++#define raw_printk_cpu_unlock_irqrestore(flags) \ + do { \ + __printk_cpu_unlock(); \ + local_irq_restore(flags); \ +- } while (0) \ +- +-#else +- +-#define printk_cpu_lock_irqsave(flags) ((void)flags) +-#define printk_cpu_unlock_irqrestore(flags) ((void)flags) +- +-#endif /* CONFIG_SMP */ ++ } while (0) + + extern int kptr_restrict; + +--- a/lib/dump_stack.c ++++ b/lib/dump_stack.c +@@ -102,9 +102,9 @@ asmlinkage __visible void dump_stack_lvl + * Permit this cpu to perform nested stack dumps while serialising + * against other CPUs + */ +- printk_cpu_lock_irqsave(flags); ++ raw_printk_cpu_lock_irqsave(flags); + __dump_stack(log_lvl); +- printk_cpu_unlock_irqrestore(flags); ++ raw_printk_cpu_unlock_irqrestore(flags); + } + EXPORT_SYMBOL(dump_stack_lvl); + +--- a/lib/nmi_backtrace.c ++++ b/lib/nmi_backtrace.c +@@ -93,7 +93,7 @@ bool nmi_cpu_backtrace(struct pt_regs *r + * Allow nested NMI backtraces while serializing + * against other CPUs. 
+ */ +- printk_cpu_lock_irqsave(flags); ++ raw_printk_cpu_lock_irqsave(flags); + if (!READ_ONCE(backtrace_idle) && regs && cpu_in_idle(instruction_pointer(regs))) { + pr_warn("NMI backtrace for cpu %d skipped: idling at %pS\n", + cpu, (void *)instruction_pointer(regs)); +@@ -104,7 +104,7 @@ bool nmi_cpu_backtrace(struct pt_regs *r + else + dump_stack(); + } +- printk_cpu_unlock_irqrestore(flags); ++ raw_printk_cpu_unlock_irqrestore(flags); + cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); + return true; + } diff --git a/debian/patches-rt/printk__use_seqcount_latch_for_console_seq.patch b/debian/patches-rt/printk__use_seqcount_latch_for_console_seq.patch new file mode 100644 index 000000000..708c7fcce --- /dev/null +++ b/debian/patches-rt/printk__use_seqcount_latch_for_console_seq.patch @@ -0,0 +1,187 @@ +Subject: printk: use seqcount_latch for console_seq +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:05 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: John Ogness <john.ogness@linutronix.de> + +In preparation for atomic printing, change @console_seq to use +seqcount_latch so that it can be read without requiring @console_sem. + +Signed-off-by: John Ogness <john.ogness@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + +--- + kernel/printk/printk.c | 73 +++++++++++++++++++++++++++---------------------- + 1 file changed, 41 insertions(+), 32 deletions(-) +--- +--- a/kernel/printk/printk.c ++++ b/kernel/printk/printk.c +@@ -362,9 +362,7 @@ static u64 syslog_seq; + static size_t syslog_partial; + static bool syslog_time; + +-/* All 3 protected by @console_sem. */ +-/* the next printk record to write to the console */ +-static u64 console_seq; ++/* Both protected by @console_sem. 
*/ + static u64 exclusive_console_stop_seq; + static unsigned long console_dropped; + +@@ -374,6 +372,17 @@ struct latched_seq { + }; + + /* ++ * The next printk record to write to the console. There are two ++ * copies (updated with seqcount_latch) so that reads can locklessly ++ * access a valid value. Writers are synchronized by @console_sem. ++ */ ++static struct latched_seq console_seq = { ++ .latch = SEQCNT_LATCH_ZERO(console_seq.latch), ++ .val[0] = 0, ++ .val[1] = 0, ++}; ++ ++/* + * The next printk record to read after the last 'clear' command. There are + * two copies (updated with seqcount_latch) so that reads can locklessly + * access a valid value. Writers are synchronized by @syslog_lock. +@@ -436,7 +445,7 @@ bool printk_percpu_data_ready(void) + return __printk_percpu_data_ready; + } + +-/* Must be called under syslog_lock. */ ++/* Must be called under associated write-protection lock. */ + static void latched_seq_write(struct latched_seq *ls, u64 val) + { + raw_write_seqcount_latch(&ls->latch); +@@ -2278,9 +2287,9 @@ EXPORT_SYMBOL(_printk); + + #define prb_read_valid(rb, seq, r) false + #define prb_first_valid_seq(rb) 0 ++#define latched_seq_read_nolock(seq) 0 ++#define latched_seq_write(dst, src) + +-static u64 syslog_seq; +-static u64 console_seq; + static u64 exclusive_console_stop_seq; + static unsigned long console_dropped; + +@@ -2608,7 +2617,7 @@ void console_unlock(void) + bool do_cond_resched, retry; + struct printk_info info; + struct printk_record r; +- u64 __maybe_unused next_seq; ++ u64 seq; + + if (console_suspended) { + up_console_sem(); +@@ -2652,12 +2661,14 @@ void console_unlock(void) + size_t len; + + skip: +- if (!prb_read_valid(prb, console_seq, &r)) ++ seq = latched_seq_read_nolock(&console_seq); ++ if (!prb_read_valid(prb, seq, &r)) + break; + +- if (console_seq != r.info->seq) { +- console_dropped += r.info->seq - console_seq; +- console_seq = r.info->seq; ++ if (seq != r.info->seq) { ++ console_dropped += r.info->seq - seq; 
++ latched_seq_write(&console_seq, r.info->seq); ++ seq = r.info->seq; + } + + if (suppress_message_printing(r.info->level)) { +@@ -2666,13 +2677,13 @@ void console_unlock(void) + * directly to the console when we received it, and + * record that has level above the console loglevel. + */ +- console_seq++; ++ latched_seq_write(&console_seq, seq + 1); + goto skip; + } + + /* Output to all consoles once old messages replayed. */ + if (unlikely(exclusive_console && +- console_seq >= exclusive_console_stop_seq)) { ++ seq >= exclusive_console_stop_seq)) { + exclusive_console = NULL; + } + +@@ -2693,7 +2704,7 @@ void console_unlock(void) + len = record_print_text(&r, + console_msg_format & MSG_FORMAT_SYSLOG, + printk_time); +- console_seq++; ++ latched_seq_write(&console_seq, seq + 1); + + /* + * While actively printing out messages, if another printk() +@@ -2721,9 +2732,6 @@ void console_unlock(void) + cond_resched(); + } + +- /* Get consistent value of the next-to-be-used sequence number. */ +- next_seq = console_seq; +- + console_locked = 0; + up_console_sem(); + +@@ -2733,7 +2741,7 @@ void console_unlock(void) + * there's a new owner and the console_unlock() from them will do the + * flush, no worries. + */ +- retry = prb_read_valid(prb, next_seq, NULL); ++ retry = prb_read_valid(prb, latched_seq_read_nolock(&console_seq), NULL); + if (retry && console_trylock()) + goto again; + } +@@ -2785,18 +2793,19 @@ void console_unblank(void) + */ + void console_flush_on_panic(enum con_flush_mode mode) + { +- /* +- * If someone else is holding the console lock, trylock will fail +- * and may_schedule may be set. Ignore and proceed to unlock so +- * that messages are flushed out. As this can be called from any +- * context and we don't want to get preempted while flushing, +- * ensure may_schedule is cleared. 
+- */ +- console_trylock(); +- console_may_schedule = 0; +- +- if (mode == CONSOLE_REPLAY_ALL) +- console_seq = prb_first_valid_seq(prb); ++ if (console_trylock()) { ++ if (mode == CONSOLE_REPLAY_ALL) ++ latched_seq_write(&console_seq, prb_first_valid_seq(prb)); ++ } else { ++ /* ++ * Another context is holding the console lock and ++ * @console_may_schedule may be set. Ignore and proceed to ++ * unlock so that messages are flushed out. As this can be ++ * called from any context and we don't want to get preempted ++ * while flushing, ensure @console_may_schedule is cleared. ++ */ ++ console_may_schedule = 0; ++ } + console_unlock(); + } + +@@ -3032,11 +3041,11 @@ void register_console(struct console *ne + * ignores console_lock. + */ + exclusive_console = newcon; +- exclusive_console_stop_seq = console_seq; ++ exclusive_console_stop_seq = latched_seq_read_nolock(&console_seq); + + /* Get a consistent copy of @syslog_seq. */ + mutex_lock(&syslog_lock); +- console_seq = syslog_seq; ++ latched_seq_write(&console_seq, syslog_seq); + mutex_unlock(&syslog_lock); + } + console_unlock(); diff --git a/debian/patches-rt/ptrace__fix_ptrace_vs_tasklist_lock_race.patch b/debian/patches-rt/ptrace__fix_ptrace_vs_tasklist_lock_race.patch new file mode 100644 index 000000000..119853b11 --- /dev/null +++ b/debian/patches-rt/ptrace__fix_ptrace_vs_tasklist_lock_race.patch @@ -0,0 +1,216 @@ +Subject: ptrace: fix ptrace vs tasklist_lock race +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu Aug 29 18:21:04 2013 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> + +As explained by Alexander Fyodorov <halcy@yandex.ru>: + +|read_lock(&tasklist_lock) in ptrace_stop() is converted to mutex on RT kernel, +|and it can remove __TASK_TRACED from task->state (by moving it to +|task->saved_state). 
If parent does wait() on child followed by a sys_ptrace +|call, the following race can happen: +| +|- child sets __TASK_TRACED in ptrace_stop() +|- parent does wait() which eventually calls wait_task_stopped() and returns +| child's pid +|- child blocks on read_lock(&tasklist_lock) in ptrace_stop() and moves +| __TASK_TRACED flag to saved_state +|- parent calls sys_ptrace, which calls ptrace_check_attach() and wait_task_inactive() + +The patch is based on his initial patch where an additional check is +added in case the __TASK_TRACED moved to ->saved_state. The pi_lock is +taken in case the caller is interrupted between looking into ->state and +->saved_state. + +[ Fix for ptrace_unfreeze_traced() by Oleg Nesterov ] +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + +--- + include/linux/sched.h | 79 +++++++++++++++++++++++++++++++++++++++++++++++--- + kernel/ptrace.c | 38 +++++++++++++++++++----- + kernel/sched/core.c | 4 +- + 3 files changed, 108 insertions(+), 13 deletions(-) +--- +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -118,12 +118,8 @@ struct task_group; + + #define task_is_running(task) (READ_ONCE((task)->__state) == TASK_RUNNING) + +-#define task_is_traced(task) ((READ_ONCE(task->__state) & __TASK_TRACED) != 0) +- + #define task_is_stopped(task) ((READ_ONCE(task->__state) & __TASK_STOPPED) != 0) + +-#define task_is_stopped_or_traced(task) ((READ_ONCE(task->__state) & (__TASK_STOPPED | __TASK_TRACED)) != 0) +- + /* + * Special states are those that do not use the normal wait-loop pattern. See + * the comment with set_special_state(). 
+@@ -2015,6 +2011,81 @@ static inline int test_tsk_need_resched( + return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED)); + } + ++#ifdef CONFIG_PREEMPT_RT ++static inline bool task_match_saved_state(struct task_struct *p, long match_state) ++{ ++ return p->saved_state == match_state; ++} ++ ++static inline bool task_is_traced(struct task_struct *task) ++{ ++ bool traced = false; ++ ++ /* in case the task is sleeping on tasklist_lock */ ++ raw_spin_lock_irq(&task->pi_lock); ++ if (READ_ONCE(task->__state) & __TASK_TRACED) ++ traced = true; ++ else if (task->saved_state & __TASK_TRACED) ++ traced = true; ++ raw_spin_unlock_irq(&task->pi_lock); ++ return traced; ++} ++ ++static inline bool task_is_stopped_or_traced(struct task_struct *task) ++{ ++ bool traced_stopped = false; ++ unsigned long flags; ++ ++ raw_spin_lock_irqsave(&task->pi_lock, flags); ++ ++ if (READ_ONCE(task->__state) & (__TASK_STOPPED | __TASK_TRACED)) ++ traced_stopped = true; ++ else if (task->saved_state & (__TASK_STOPPED | __TASK_TRACED)) ++ traced_stopped = true; ++ ++ raw_spin_unlock_irqrestore(&task->pi_lock, flags); ++ return traced_stopped; ++} ++ ++#else ++ ++static inline bool task_match_saved_state(struct task_struct *p, long match_state) ++{ ++ return false; ++} ++ ++static inline bool task_is_traced(struct task_struct *task) ++{ ++ return READ_ONCE(task->__state) & __TASK_TRACED; ++} ++ ++static inline bool task_is_stopped_or_traced(struct task_struct *task) ++{ ++ return READ_ONCE(task->__state) & (__TASK_STOPPED | __TASK_TRACED); ++} ++#endif ++ ++static inline bool task_match_state_or_saved(struct task_struct *p, ++ long match_state) ++{ ++ if (READ_ONCE(p->__state) == match_state) ++ return true; ++ ++ return task_match_saved_state(p, match_state); ++} ++ ++static inline bool task_match_state_lock(struct task_struct *p, ++ long match_state) ++{ ++ bool match; ++ ++ raw_spin_lock_irq(&p->pi_lock); ++ match = task_match_state_or_saved(p, match_state); ++ 
raw_spin_unlock_irq(&p->pi_lock); ++ ++ return match; ++} ++ + /* + * cond_resched() and cond_resched_lock(): latency reduction via + * explicit rescheduling in places that are safe. The return +--- a/kernel/ptrace.c ++++ b/kernel/ptrace.c +@@ -197,7 +197,18 @@ static bool ptrace_freeze_traced(struct + spin_lock_irq(&task->sighand->siglock); + if (task_is_traced(task) && !looks_like_a_spurious_pid(task) && + !__fatal_signal_pending(task)) { ++#ifdef CONFIG_PREEMPT_RT ++ unsigned long flags; ++ ++ raw_spin_lock_irqsave(&task->pi_lock, flags); ++ if (READ_ONCE(task->__state) & __TASK_TRACED) ++ WRITE_ONCE(task->__state, __TASK_TRACED); ++ else ++ task->saved_state = __TASK_TRACED; ++ raw_spin_unlock_irqrestore(&task->pi_lock, flags); ++#else + WRITE_ONCE(task->__state, __TASK_TRACED); ++#endif + ret = true; + } + spin_unlock_irq(&task->sighand->siglock); +@@ -207,7 +218,11 @@ static bool ptrace_freeze_traced(struct + + static void ptrace_unfreeze_traced(struct task_struct *task) + { +- if (READ_ONCE(task->__state) != __TASK_TRACED) ++ unsigned long flags; ++ bool frozen = true; ++ ++ if (!IS_ENABLED(CONFIG_PREEMPT_RT) && ++ READ_ONCE(task->__state) != __TASK_TRACED) + return; + + WARN_ON(!task->ptrace || task->parent != current); +@@ -217,12 +232,21 @@ static void ptrace_unfreeze_traced(struc + * Recheck state under the lock to close this race. 
+ */ + spin_lock_irq(&task->sighand->siglock); +- if (READ_ONCE(task->__state) == __TASK_TRACED) { +- if (__fatal_signal_pending(task)) +- wake_up_state(task, __TASK_TRACED); +- else +- WRITE_ONCE(task->__state, TASK_TRACED); +- } ++ raw_spin_lock_irqsave(&task->pi_lock, flags); ++ if (READ_ONCE(task->__state) == __TASK_TRACED) ++ WRITE_ONCE(task->__state, TASK_TRACED); ++ ++#ifdef CONFIG_PREEMPT_RT ++ else if (task->saved_state == __TASK_TRACED) ++ task->saved_state = TASK_TRACED; ++#endif ++ else ++ frozen = false; ++ raw_spin_unlock_irqrestore(&task->pi_lock, flags); ++ ++ if (frozen && __fatal_signal_pending(task)) ++ wake_up_state(task, __TASK_TRACED); ++ + spin_unlock_irq(&task->sighand->siglock); + } + +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -3207,7 +3207,7 @@ unsigned long wait_task_inactive(struct + * is actually now running somewhere else! + */ + while (task_running(rq, p)) { +- if (match_state && unlikely(READ_ONCE(p->__state) != match_state)) ++ if (match_state && !task_match_state_lock(p, match_state)) + return 0; + cpu_relax(); + } +@@ -3222,7 +3222,7 @@ unsigned long wait_task_inactive(struct + running = task_running(rq, p); + queued = task_on_rq_queued(p); + ncsw = 0; +- if (!match_state || READ_ONCE(p->__state) == match_state) ++ if (!match_state || task_match_state_or_saved(p, match_state)) + ncsw = p->nvcsw | LONG_MIN; /* sets MSB */ + task_rq_unlock(rq, p, &rf); + diff --git a/debian/patches-rt/0251-random-Make-it-work-on-rt.patch b/debian/patches-rt/random__Make_it_work_on_rt.patch index 0bdf570b1..043a43da2 100644 --- a/debian/patches-rt/0251-random-Make-it-work-on-rt.patch +++ b/debian/patches-rt/random__Make_it_work_on_rt.patch @@ -1,8 +1,9 @@ -From 12d38fa74d9f66e299f4bfadf24f2cb1908e5812 Mon Sep 17 00:00:00 2001 +Subject: random: Make it work on rt +From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue Aug 21 20:38:50 2012 +0200 +Origin: 
https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Tue, 21 Aug 2012 20:38:50 +0200 -Subject: [PATCH 251/296] random: Make it work on rt -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Delegate the random insertion to the forced threaded interrupt handler. Store the return IP of the hard interrupt handler in the irq @@ -10,22 +11,23 @@ descriptor and feed it into the random generator as a source of entropy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - arch/x86/kernel/cpu/mshyperv.c | 3 ++- - drivers/char/random.c | 11 +++++------ - drivers/hv/hyperv_vmbus.h | 1 + - drivers/hv/vmbus_drv.c | 5 ++++- - include/linux/irqdesc.h | 1 + - include/linux/random.h | 2 +- - kernel/irq/handle.c | 8 +++++++- - kernel/irq/manage.c | 6 ++++++ - 8 files changed, 27 insertions(+), 10 deletions(-) -diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c -index 6cc50ab07bde..4055fe3ae869 100644 + + +--- + arch/x86/kernel/cpu/mshyperv.c | 3 ++- + drivers/char/random.c | 11 +++++------ + drivers/hv/hyperv_vmbus.h | 1 + + drivers/hv/vmbus_drv.c | 5 ++++- + include/linux/irqdesc.h | 1 + + include/linux/random.h | 2 +- + kernel/irq/handle.c | 10 ++++++++-- + kernel/irq/manage.c | 6 ++++++ + 8 files changed, 28 insertions(+), 11 deletions(-) +--- --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c -@@ -80,11 +80,12 @@ EXPORT_SYMBOL_GPL(hv_remove_vmbus_irq); +@@ -75,11 +75,12 @@ void hv_remove_vmbus_handler(void) DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_stimer0) { struct pt_regs *old_regs = set_irq_regs(regs); @@ -39,11 +41,9 @@ index 6cc50ab07bde..4055fe3ae869 100644 ack_APIC_irq(); set_irq_regs(old_regs); -diff --git a/drivers/char/random.c b/drivers/char/random.c -index f462b9d2f5a5..a5370228c17f 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c -@@ -1252,28 +1252,27 @@ static __u32 
get_reg(struct fast_pool *f, struct pt_regs *regs) +@@ -1242,26 +1242,25 @@ static __u32 get_reg(struct fast_pool *f return *ptr; } @@ -57,8 +57,6 @@ index f462b9d2f5a5..a5370228c17f 100644 cycles_t cycles = random_get_entropy(); __u32 c_high, j_high; - __u64 ip; - unsigned long seed; - int credit = 0; if (cycles == 0) - cycles = get_reg(fast_pool, regs); @@ -77,11 +75,9 @@ index f462b9d2f5a5..a5370228c17f 100644 fast_mix(fast_pool); add_interrupt_bench(cycles); -diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h -index 40e2b9f91163..d9de4813ffac 100644 --- a/drivers/hv/hyperv_vmbus.h +++ b/drivers/hv/hyperv_vmbus.h -@@ -18,6 +18,7 @@ +@@ -19,6 +19,7 @@ #include <linux/atomic.h> #include <linux/hyperv.h> #include <linux/interrupt.h> @@ -89,8 +85,6 @@ index 40e2b9f91163..d9de4813ffac 100644 #include "hv_trace.h" -diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c -index d5cc74ecc582..d0763ea99ded 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -22,6 +22,7 @@ @@ -101,7 +95,7 @@ index d5cc74ecc582..d0763ea99ded 100644 #include <linux/delay.h> #include <linux/notifier.h> -@@ -1307,6 +1308,8 @@ static void vmbus_isr(void) +@@ -1337,6 +1338,8 @@ static void vmbus_isr(void) void *page_addr = hv_cpu->synic_event_page; struct hv_message *msg; union hv_synic_event_flags *event; @@ -110,17 +104,15 @@ index d5cc74ecc582..d0763ea99ded 100644 bool handled = false; if (unlikely(page_addr == NULL)) -@@ -1351,7 +1354,7 @@ static void vmbus_isr(void) +@@ -1381,7 +1384,7 @@ static void vmbus_isr(void) tasklet_schedule(&hv_cpu->msg_dpc); } -- add_interrupt_randomness(hv_get_vector(), 0); -+ add_interrupt_randomness(hv_get_vector(), 0, ip); +- add_interrupt_randomness(vmbus_interrupt, 0); ++ add_interrupt_randomness(vmbus_interrupt, 0, ip); } - /* -diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h -index 5745491303e0..2b9caf39fb07 100644 + static irqreturn_t vmbus_percpu_isr(int irq, void *dev_id) --- a/include/linux/irqdesc.h 
+++ b/include/linux/irqdesc.h @@ -68,6 +68,7 @@ struct irq_desc { @@ -131,11 +123,9 @@ index 5745491303e0..2b9caf39fb07 100644 raw_spinlock_t lock; struct cpumask *percpu_enabled; const struct cpumask *percpu_affinity; -diff --git a/include/linux/random.h b/include/linux/random.h -index f45b8be3e3c4..0e41d0527809 100644 --- a/include/linux/random.h +++ b/include/linux/random.h -@@ -35,7 +35,7 @@ static inline void add_latent_entropy(void) {} +@@ -35,7 +35,7 @@ static inline void add_latent_entropy(vo extern void add_input_randomness(unsigned int type, unsigned int code, unsigned int value) __latent_entropy; @@ -144,16 +134,17 @@ index f45b8be3e3c4..0e41d0527809 100644 extern void get_random_bytes(void *buf, int nbytes); extern int wait_for_random_bytes(void); -diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c -index 762a928e18f9..7929fcdb7817 100644 --- a/kernel/irq/handle.c +++ b/kernel/irq/handle.c -@@ -192,10 +192,16 @@ irqreturn_t handle_irq_event_percpu(struct irq_desc *desc) +@@ -190,12 +190,18 @@ irqreturn_t __handle_irq_event_percpu(st + + irqreturn_t handle_irq_event_percpu(struct irq_desc *desc) { - irqreturn_t retval; - unsigned int flags = 0; +- irqreturn_t retval; + struct pt_regs *regs = get_irq_regs(); + u64 ip = regs ? 
instruction_pointer(regs) : 0; + unsigned int flags = 0; ++ irqreturn_t retval; retval = __handle_irq_event_percpu(desc, &flags); @@ -164,13 +155,11 @@ index 762a928e18f9..7929fcdb7817 100644 + add_interrupt_randomness(desc->irq_data.irq, flags, ip); +#endif - if (!noirqdebug) + if (!irq_settings_no_debug(desc)) note_interrupt(desc, retval); -diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c -index 0558f75c0b85..8857ffa6e0e1 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c -@@ -1181,6 +1181,12 @@ static int irq_thread(void *data) +@@ -1281,6 +1281,12 @@ static int irq_thread(void *data) if (action_ret == IRQ_WAKE_THREAD) irq_wake_secondary(desc, action); @@ -183,6 +172,3 @@ index 0558f75c0b85..8857ffa6e0e1 100644 wake_threads_waitq(desc); } --- -2.30.2 - diff --git a/debian/patches-rt/rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch b/debian/patches-rt/rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch new file mode 100644 index 000000000..f8c87c567 --- /dev/null +++ b/debian/patches-rt/rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch @@ -0,0 +1,84 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Date: Tue, 21 Sep 2021 23:12:50 +0200 +Subject: [PATCH] rcu/tree: Protect rcu_rdp_is_offloaded() invocations on RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Valentin reported warnings about suspicious RCU usage on RT kernels. Those +happen when offloading of RCU callbacks is enabled: + + WARNING: suspicious RCU usage + 5.13.0-rt1 #20 Not tainted + ----------------------------- + kernel/rcu/tree_plugin.h:69 Unsafe read of RCU_NOCB offloaded state! 
+ + rcu_rdp_is_offloaded (kernel/rcu/tree_plugin.h:69 kernel/rcu/tree_plugin.h:58) + rcu_core (kernel/rcu/tree.c:2332 kernel/rcu/tree.c:2398 kernel/rcu/tree.c:2777) + rcu_cpu_kthread (./include/linux/bottom_half.h:32 kernel/rcu/tree.c:2876) + +The reason is that rcu_rdp_is_offloaded() is invoked without one of the +required protections on RT-enabled kernels because local_bh_disable() does +not disable preemption on RT. + +Valentin proposed to add a local lock to the code in question, but that's +suboptimal in several aspects: + + 1) local locks add extra code to !RT kernels for no value. + + 2) All possible callsites have to be audited and amended when affected, + possibly at an outer function level due to lock nesting issues. + + 3) As the local lock has to be taken at the outer functions it's required + to release and reacquire them in the inner code sections which might + voluntarily schedule, e.g. rcu_do_batch(). + +Both callsites of rcu_rdp_is_offloaded() which trigger this check invoke +rcu_rdp_is_offloaded() in the variable declaration section right at the top +of the functions. But the actual usage of the result is either within a +section which provides the required protections or after such a section. + +So the obvious solution is to move the invocation into the code sections +which provide the proper protections, which solves the problem for RT and +does not have any impact on !RT kernels.
+ +Reported-by: Valentin Schneider <valentin.schneider@arm.com> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + kernel/rcu/tree.c | 7 ++++--- + 1 file changed, 4 insertions(+), 3 deletions(-) + +--- a/kernel/rcu/tree.c ++++ b/kernel/rcu/tree.c +@@ -2278,13 +2278,13 @@ rcu_report_qs_rdp(struct rcu_data *rdp) + { + unsigned long flags; + unsigned long mask; +- bool needwake = false; +- const bool offloaded = rcu_rdp_is_offloaded(rdp); ++ bool offloaded, needwake = false; + struct rcu_node *rnp; + + WARN_ON_ONCE(rdp->cpu != smp_processor_id()); + rnp = rdp->mynode; + raw_spin_lock_irqsave_rcu_node(rnp, flags); ++ offloaded = rcu_rdp_is_offloaded(rdp); + if (rdp->cpu_no_qs.b.norm || rdp->gp_seq != rnp->gp_seq || + rdp->gpwrap) { + +@@ -2446,7 +2446,7 @@ static void rcu_do_batch(struct rcu_data + int div; + bool __maybe_unused empty; + unsigned long flags; +- const bool offloaded = rcu_rdp_is_offloaded(rdp); ++ bool offloaded; + struct rcu_head *rhp; + struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl); + long bl, count = 0; +@@ -2472,6 +2472,7 @@ static void rcu_do_batch(struct rcu_data + rcu_nocb_lock(rdp); + WARN_ON_ONCE(cpu_is_offline(smp_processor_id())); + pending = rcu_segcblist_n_cbs(&rdp->cblist); ++ offloaded = rcu_rdp_is_offloaded(rdp); + div = READ_ONCE(rcu_divisor); + div = div < 0 ? 7 : div > sizeof(long) * 8 - 2 ? 
sizeof(long) * 8 - 2 : div; + bl = max(rdp->blimit, pending >> div); diff --git a/debian/patches-rt/rcu__Delay_RCU-selftests.patch b/debian/patches-rt/rcu__Delay_RCU-selftests.patch new file mode 100644 index 000000000..002686d77 --- /dev/null +++ b/debian/patches-rt/rcu__Delay_RCU-selftests.patch @@ -0,0 +1,77 @@ +Subject: rcu: Delay RCU-selftests +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Wed Mar 10 15:09:02 2021 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> + +Delay RCU-selftests until ksoftirqd is up and running. + +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + +--- + include/linux/rcupdate.h | 7 +++++++ + init/main.c | 1 + + kernel/rcu/tasks.h | 9 ++------- + 3 files changed, 10 insertions(+), 7 deletions(-) +--- +--- a/include/linux/rcupdate.h ++++ b/include/linux/rcupdate.h +@@ -94,6 +94,13 @@ void rcu_init_tasks_generic(void); + static inline void rcu_init_tasks_generic(void) { } + #endif + ++#if defined(CONFIG_PROVE_RCU) && defined(CONFIG_TASKS_RCU_GENERIC) ++void rcu_tasks_initiate_self_tests(void); ++#else ++static inline void rcu_tasks_initiate_self_tests(void) {} ++#endif ++ ++ + #ifdef CONFIG_RCU_STALL_COMMON + void rcu_sysrq_start(void); + void rcu_sysrq_end(void); +--- a/init/main.c ++++ b/init/main.c +@@ -1602,6 +1602,7 @@ static noinline void __init kernel_init_ + + rcu_init_tasks_generic(); + do_pre_smp_initcalls(); ++ rcu_tasks_initiate_self_tests(); + lockup_detector_init(); + + smp_init(); +--- a/kernel/rcu/tasks.h ++++ b/kernel/rcu/tasks.h +@@ -1348,7 +1348,7 @@ static void test_rcu_tasks_callback(stru + rttd->notrun = true; + } + +-static void rcu_tasks_initiate_self_tests(void) ++void rcu_tasks_initiate_self_tests(void) + { + pr_info("Running RCU-tasks wait API self tests\n"); + #ifdef CONFIG_TASKS_RCU +@@ -1385,9 +1385,7 @@ 
static int rcu_tasks_verify_self_tests(v + return ret; + } + late_initcall(rcu_tasks_verify_self_tests); +-#else /* #ifdef CONFIG_PROVE_RCU */ +-static void rcu_tasks_initiate_self_tests(void) { } +-#endif /* #else #ifdef CONFIG_PROVE_RCU */ ++#endif /* #ifdef CONFIG_PROVE_RCU */ + + void __init rcu_init_tasks_generic(void) + { +@@ -1402,9 +1400,6 @@ void __init rcu_init_tasks_generic(void) + #ifdef CONFIG_TASKS_TRACE_RCU + rcu_spawn_tasks_trace_kthread(); + #endif +- +- // Run the self-tests. +- rcu_tasks_initiate_self_tests(); + } + + #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */ diff --git a/debian/patches-rt/samples_kfifo__Rename_read_lock_write_lock.patch b/debian/patches-rt/samples_kfifo__Rename_read_lock_write_lock.patch new file mode 100644 index 000000000..5bf7ddcd0 --- /dev/null +++ b/debian/patches-rt/samples_kfifo__Rename_read_lock_write_lock.patch @@ -0,0 +1,157 @@ +Subject: samples/kfifo: Rename read_lock/write_lock +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu Jul 1 17:43:16 2021 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> + +The variable names read_lock and write_lock can clash with functions used for +reader/writer locks. + +Rename read_lock to read_access and write_lock to write_access to avoid a name +collision.
+ +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Link: https://lkml.kernel.org/r/20210806152551.qio7c3ho6pexezup@linutronix.de +--- + samples/kfifo/bytestream-example.c | 12 ++++++------ + samples/kfifo/inttype-example.c | 12 ++++++------ + samples/kfifo/record-example.c | 12 ++++++------ + 3 files changed, 18 insertions(+), 18 deletions(-) +--- +--- a/samples/kfifo/bytestream-example.c ++++ b/samples/kfifo/bytestream-example.c +@@ -22,10 +22,10 @@ + #define PROC_FIFO "bytestream-fifo" + + /* lock for procfs read access */ +-static DEFINE_MUTEX(read_lock); ++static DEFINE_MUTEX(read_access); + + /* lock for procfs write access */ +-static DEFINE_MUTEX(write_lock); ++static DEFINE_MUTEX(write_access); + + /* + * define DYNAMIC in this example for a dynamically allocated fifo. +@@ -116,12 +116,12 @@ static ssize_t fifo_write(struct file *f + int ret; + unsigned int copied; + +- if (mutex_lock_interruptible(&write_lock)) ++ if (mutex_lock_interruptible(&write_access)) + return -ERESTARTSYS; + + ret = kfifo_from_user(&test, buf, count, &copied); + +- mutex_unlock(&write_lock); ++ mutex_unlock(&write_access); + if (ret) + return ret; + +@@ -134,12 +134,12 @@ static ssize_t fifo_read(struct file *fi + int ret; + unsigned int copied; + +- if (mutex_lock_interruptible(&read_lock)) ++ if (mutex_lock_interruptible(&read_access)) + return -ERESTARTSYS; + + ret = kfifo_to_user(&test, buf, count, &copied); + +- mutex_unlock(&read_lock); ++ mutex_unlock(&read_access); + if (ret) + return ret; + +--- a/samples/kfifo/inttype-example.c ++++ b/samples/kfifo/inttype-example.c +@@ -22,10 +22,10 @@ + #define PROC_FIFO "int-fifo" + + /* lock for procfs read access */ +-static DEFINE_MUTEX(read_lock); ++static DEFINE_MUTEX(read_access); + + /* lock for procfs write access */ +-static DEFINE_MUTEX(write_lock); ++static DEFINE_MUTEX(write_access); + + /* + * define DYNAMIC in this example for a dynamically allocated 
fifo. +@@ -109,12 +109,12 @@ static ssize_t fifo_write(struct file *f + int ret; + unsigned int copied; + +- if (mutex_lock_interruptible(&write_lock)) ++ if (mutex_lock_interruptible(&write_access)) + return -ERESTARTSYS; + + ret = kfifo_from_user(&test, buf, count, &copied); + +- mutex_unlock(&write_lock); ++ mutex_unlock(&write_access); + if (ret) + return ret; + +@@ -127,12 +127,12 @@ static ssize_t fifo_read(struct file *fi + int ret; + unsigned int copied; + +- if (mutex_lock_interruptible(&read_lock)) ++ if (mutex_lock_interruptible(&read_access)) + return -ERESTARTSYS; + + ret = kfifo_to_user(&test, buf, count, &copied); + +- mutex_unlock(&read_lock); ++ mutex_unlock(&read_access); + if (ret) + return ret; + +--- a/samples/kfifo/record-example.c ++++ b/samples/kfifo/record-example.c +@@ -22,10 +22,10 @@ + #define PROC_FIFO "record-fifo" + + /* lock for procfs read access */ +-static DEFINE_MUTEX(read_lock); ++static DEFINE_MUTEX(read_access); + + /* lock for procfs write access */ +-static DEFINE_MUTEX(write_lock); ++static DEFINE_MUTEX(write_access); + + /* + * define DYNAMIC in this example for a dynamically allocated fifo. 
+@@ -123,12 +123,12 @@ static ssize_t fifo_write(struct file *f + int ret; + unsigned int copied; + +- if (mutex_lock_interruptible(&write_lock)) ++ if (mutex_lock_interruptible(&write_access)) + return -ERESTARTSYS; + + ret = kfifo_from_user(&test, buf, count, &copied); + +- mutex_unlock(&write_lock); ++ mutex_unlock(&write_access); + if (ret) + return ret; + +@@ -141,12 +141,12 @@ static ssize_t fifo_read(struct file *fi + int ret; + unsigned int copied; + +- if (mutex_lock_interruptible(&read_lock)) ++ if (mutex_lock_interruptible(&read_access)) + return -ERESTARTSYS; + + ret = kfifo_to_user(&test, buf, count, &copied); + +- mutex_unlock(&read_lock); ++ mutex_unlock(&read_access); + if (ret) + return ret; + diff --git a/debian/patches-rt/sched-Make-preempt_enable_no_resched-behave-like-pre.patch b/debian/patches-rt/sched-Make-preempt_enable_no_resched-behave-like-pre.patch new file mode 100644 index 000000000..88fe420df --- /dev/null +++ b/debian/patches-rt/sched-Make-preempt_enable_no_resched-behave-like-pre.patch @@ -0,0 +1,27 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Date: Fri, 17 Sep 2021 12:56:01 +0200 +Subject: [PATCH] sched: Make preempt_enable_no_resched() behave like + preempt_enable() on PREEMPT_RT +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + include/linux/preempt.h | 6 +++++- + 1 file changed, 5 insertions(+), 1 deletion(-) + +--- a/include/linux/preempt.h ++++ b/include/linux/preempt.h +@@ -189,7 +189,11 @@ do { \ + preempt_count_dec(); \ + } while (0) + +-#define preempt_enable_no_resched() sched_preempt_enable_no_resched() ++#ifndef CONFIG_PREEMPT_RT ++# define preempt_enable_no_resched() sched_preempt_enable_no_resched() ++#else ++# define preempt_enable_no_resched() preempt_enable() ++#endif + + #define preemptible() (preempt_count() == 0 && !irqs_disabled()) + 
diff --git a/debian/patches-rt/sched-Switch-wait_task_inactive-to-HRTIMER_MODE_REL_.patch b/debian/patches-rt/sched-Switch-wait_task_inactive-to-HRTIMER_MODE_REL_.patch new file mode 100644 index 000000000..8c602262e --- /dev/null +++ b/debian/patches-rt/sched-Switch-wait_task_inactive-to-HRTIMER_MODE_REL_.patch @@ -0,0 +1,40 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue, 24 Aug 2021 22:47:37 +0200 +Subject: [PATCH] sched: Switch wait_task_inactive to HRTIMER_MODE_REL_HARD +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +With PREEMPT_RT enabled all hrtimer callbacks will be invoked in +softirq mode unless they are explicitly marked as HRTIMER_MODE_HARD. +During boot kthread_bind() is used for the creation of per-CPU threads +and then hangs in wait_task_inactive() if the ksoftirqd is not +yet up and running. +The hang disappeared since commit + 26c7295be0c5e ("kthread: Do not preempt current task if it is going to call schedule()") + +but enabling function tracing on boot reliably leads to the freeze-on-boot +behaviour again. +The timer in wait_task_inactive() cannot be directly used by a user +interface to abuse it and create a mass wake-up of several tasks at the +same time, which would lead to long sections with disabled interrupts. +Therefore it is safe to make the timer HRTIMER_MODE_REL_HARD. + +Switch the timer to HRTIMER_MODE_REL_HARD.
+ +Cc: stable-rt@vger.kernel.org +Link: https://lkml.kernel.org/r/20210826170408.vm7rlj7odslshwch@linutronix.de +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + kernel/sched/core.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +--- a/kernel/sched/core.c ++++ b/kernel/sched/core.c +@@ -3250,7 +3250,7 @@ unsigned long wait_task_inactive(struct + ktime_t to = NSEC_PER_SEC / HZ; + + set_current_state(TASK_UNINTERRUPTIBLE); +- schedule_hrtimeout(&to, HRTIMER_MODE_REL); ++ schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD); + continue; + } + diff --git a/debian/patches-rt/0265-sched-Add-support-for-lazy-preemption.patch b/debian/patches-rt/sched__Add_support_for_lazy_preemption.patch index 6aae5e101..3dfb561e7 100644 --- a/debian/patches-rt/0265-sched-Add-support-for-lazy-preemption.patch +++ b/debian/patches-rt/sched__Add_support_for_lazy_preemption.patch @@ -1,8 +1,9 @@ -From 6e6ff767ad18415853867e07959bd151601ac9a8 Mon Sep 17 00:00:00 2001 +Subject: sched: Add support for lazy preemption +From: Thomas Gleixner <tglx@linutronix.de> +Date: Fri Oct 26 18:50:54 2012 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Fri, 26 Oct 2012 18:50:54 +0100 -Subject: [PATCH 265/296] sched: Add support for lazy preemption -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz It has become an obsession to mitigate the determinism vs. throughput loss of RT. Looking at the mainline semantics of preemption points @@ -53,26 +54,26 @@ there is a clear trend that it enhances the non RT workload performance. 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - include/linux/preempt.h | 54 ++++++++++++++++++++++-- - include/linux/sched.h | 38 +++++++++++++++++ - include/linux/thread_info.h | 12 +++++- - include/linux/trace_events.h | 5 ++- - kernel/Kconfig.preempt | 6 +++ - kernel/sched/core.c | 82 +++++++++++++++++++++++++++++++++++- - kernel/sched/fair.c | 16 +++---- - kernel/sched/features.h | 3 ++ - kernel/sched/sched.h | 9 ++++ - kernel/trace/trace.c | 50 +++++++++++++--------- - kernel/trace/trace_events.c | 1 + - kernel/trace/trace_output.c | 14 +++++- - 12 files changed, 254 insertions(+), 36 deletions(-) -diff --git a/include/linux/preempt.h b/include/linux/preempt.h -index fb140e00f74d..af39859f02ee 100644 + +--- + include/linux/preempt.h | 54 +++++++++++++++++++++++++++-- + include/linux/sched.h | 37 +++++++++++++++++++ + include/linux/thread_info.h | 12 +++++- + include/linux/trace_events.h | 5 ++ + kernel/Kconfig.preempt | 6 +++ + kernel/sched/core.c | 80 +++++++++++++++++++++++++++++++++++++++++-- + kernel/sched/fair.c | 16 ++++---- + kernel/sched/features.h | 3 + + kernel/sched/sched.h | 9 ++++ + kernel/trace/trace.c | 46 +++++++++++++++--------- + kernel/trace/trace_events.c | 1 + kernel/trace/trace_output.c | 14 ++++++- + 12 files changed, 248 insertions(+), 35 deletions(-) +--- --- a/include/linux/preempt.h +++ b/include/linux/preempt.h -@@ -174,6 +174,20 @@ extern void preempt_count_sub(int val); +@@ -175,6 +175,20 @@ extern void preempt_count_sub(int val); #define preempt_count_inc() preempt_count_add(1) #define preempt_count_dec() preempt_count_sub(1) @@ -93,7 +94,7 @@ index fb140e00f74d..af39859f02ee 100644 #ifdef CONFIG_PREEMPT_COUNT #define preempt_disable() \ -@@ -182,6 +196,12 @@ do { \ +@@ -183,6 +197,12 @@ do { \ barrier(); \ } while (0) @@ -106,7 +107,7 @@ index fb140e00f74d..af39859f02ee 100644 #define sched_preempt_enable_no_resched() \ do { \ barrier(); \ -@@ -219,6 +239,18 @@ do { \ +@@ -220,6 +240,18 @@ do { \ 
__preempt_schedule(); \ } while (0) @@ -125,7 +126,7 @@ index fb140e00f74d..af39859f02ee 100644 #else /* !CONFIG_PREEMPTION */ #define preempt_enable() \ do { \ -@@ -226,6 +258,12 @@ do { \ +@@ -227,6 +259,12 @@ do { \ preempt_count_dec(); \ } while (0) @@ -138,7 +139,7 @@ index fb140e00f74d..af39859f02ee 100644 #define preempt_enable_notrace() \ do { \ barrier(); \ -@@ -267,6 +305,9 @@ do { \ +@@ -268,6 +306,9 @@ do { \ #define preempt_check_resched_rt() barrier() #define preemptible() 0 @@ -148,7 +149,7 @@ index fb140e00f74d..af39859f02ee 100644 #endif /* CONFIG_PREEMPT_COUNT */ #ifdef MODULE -@@ -285,7 +326,7 @@ do { \ +@@ -286,7 +327,7 @@ do { \ } while (0) #define preempt_fold_need_resched() \ do { \ @@ -157,7 +158,7 @@ index fb140e00f74d..af39859f02ee 100644 set_preempt_need_resched(); \ } while (0) -@@ -413,8 +454,15 @@ extern void migrate_enable(void); +@@ -410,8 +451,15 @@ extern void migrate_enable(void); #else @@ -175,11 +176,9 @@ index fb140e00f74d..af39859f02ee 100644 #endif /* CONFIG_SMP */ -diff --git a/include/linux/sched.h b/include/linux/sched.h -index 05d0d463f30b..28509a37eb71 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h -@@ -1872,6 +1872,44 @@ static inline int test_tsk_need_resched(struct task_struct *tsk) +@@ -2015,6 +2015,43 @@ static inline int test_tsk_need_resched( return unlikely(test_tsk_thread_flag(tsk,TIF_NEED_RESCHED)); } @@ -220,24 +219,21 @@ index 05d0d463f30b..28509a37eb71 100644 + +#endif + -+ - static inline bool __task_is_stopped_or_traced(struct task_struct *task) + #ifdef CONFIG_PREEMPT_RT + static inline bool task_match_saved_state(struct task_struct *p, long match_state) { - if (task->state & (__TASK_STOPPED | __TASK_TRACED)) -diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h -index f3040b0b4b23..3cb02ced141b 100644 --- a/include/linux/thread_info.h +++ b/include/linux/thread_info.h -@@ -110,7 +110,17 @@ static inline int test_ti_thread_flag(struct thread_info *ti, int flag) - #define 
test_thread_flag(flag) \ - test_ti_thread_flag(current_thread_info(), flag) +@@ -163,7 +163,17 @@ static inline int test_ti_thread_flag(st + clear_ti_thread_flag(task_thread_info(t), TIF_##fl) + #endif /* !CONFIG_GENERIC_ENTRY */ -#define tif_need_resched() test_thread_flag(TIF_NEED_RESCHED) +#ifdef CONFIG_PREEMPT_LAZY +#define tif_need_resched() (test_thread_flag(TIF_NEED_RESCHED) || \ + test_thread_flag(TIF_NEED_RESCHED_LAZY)) +#define tif_need_resched_now() (test_thread_flag(TIF_NEED_RESCHED)) -+#define tif_need_resched_lazy() test_thread_flag(TIF_NEED_RESCHED_LAZY)) ++#define tif_need_resched_lazy() test_thread_flag(TIF_NEED_RESCHED_LAZY) + +#else +#define tif_need_resched() test_thread_flag(TIF_NEED_RESCHED) @@ -247,22 +243,20 @@ index f3040b0b4b23..3cb02ced141b 100644 #ifndef CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES static inline int arch_within_stack_frames(const void * const stack, -diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h -index 29c24ec33ffd..89c3f7162267 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h -@@ -68,6 +68,7 @@ struct trace_entry { +@@ -69,6 +69,7 @@ struct trace_entry { + unsigned char flags; unsigned char preempt_count; int pid; - unsigned char migrate_disable; + unsigned char preempt_lazy_count; }; #define TRACE_EVENT_TYPE_MAX \ -@@ -155,9 +156,10 @@ static inline void tracing_generic_entry_update(struct trace_entry *entry, +@@ -157,9 +158,10 @@ static inline void tracing_generic_entry + unsigned int trace_ctx) { entry->preempt_count = trace_ctx & 0xff; - entry->migrate_disable = (trace_ctx >> 8) & 0xff; + entry->preempt_lazy_count = (trace_ctx >> 16) & 0xff; entry->pid = current->pid; entry->type = type; @@ -271,7 +265,7 @@ index 29c24ec33ffd..89c3f7162267 100644 } unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status); -@@ -170,6 +172,7 @@ enum trace_flag_type { +@@ -172,6 +174,7 @@ enum trace_flag_type { TRACE_FLAG_SOFTIRQ = 0x10, TRACE_FLAG_PREEMPT_RESCHED = 0x20, 
TRACE_FLAG_NMI = 0x40, @@ -279,8 +273,6 @@ index 29c24ec33ffd..89c3f7162267 100644 }; #ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT -diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt -index cbe3aa495519..b5cd1e278eb5 100644 --- a/kernel/Kconfig.preempt +++ b/kernel/Kconfig.preempt @@ -1,5 +1,11 @@ @@ -295,11 +287,9 @@ index cbe3aa495519..b5cd1e278eb5 100644 choice prompt "Preemption Model" default PREEMPT_NONE -diff --git a/kernel/sched/core.c b/kernel/sched/core.c -index 7fc1b5eefd3e..f6b931d82443 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c -@@ -655,6 +655,48 @@ void resched_curr(struct rq *rq) +@@ -986,6 +986,46 @@ void resched_curr(struct rq *rq) trace_sched_wake_idle_without_ipi(cpu); } @@ -324,8 +314,6 @@ index 7fc1b5eefd3e..f6b931d82443 100644 + return; + } + -+ lockdep_assert_held(&rq->lock); -+ + if (test_tsk_need_resched(curr)) + return; + @@ -348,7 +336,7 @@ index 7fc1b5eefd3e..f6b931d82443 100644 void resched_cpu(int cpu) { struct rq *rq = cpu_rq(cpu); -@@ -1756,6 +1798,7 @@ void migrate_disable(void) +@@ -2141,6 +2181,7 @@ void migrate_disable(void) preempt_disable(); this_rq()->nr_pinned++; p->migration_disabled = 1; @@ -356,15 +344,15 @@ index 7fc1b5eefd3e..f6b931d82443 100644 preempt_enable(); } EXPORT_SYMBOL_GPL(migrate_disable); -@@ -1784,6 +1827,7 @@ void migrate_enable(void) +@@ -2171,6 +2212,7 @@ void migrate_enable(void) barrier(); p->migration_disabled = 0; this_rq()->nr_pinned--; + preempt_lazy_enable(); preempt_enable(); - - trace_sched_migrate_enable_tp(p); -@@ -3819,6 +3863,9 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p) + } + EXPORT_SYMBOL_GPL(migrate_enable); +@@ -4406,6 +4448,9 @@ int sched_fork(unsigned long clone_flags p->on_cpu = 0; #endif init_task_preempt_count(p); @@ -374,15 +362,15 @@ index 7fc1b5eefd3e..f6b931d82443 100644 #ifdef CONFIG_SMP plist_node_init(&p->pushable_tasks, MAX_PRIO); RB_CLEAR_NODE(&p->pushable_dl_tasks); -@@ -5078,6 +5125,7 @@ static void __sched notrace __schedule(bool 
preempt, bool spinning_lock) +@@ -6253,6 +6298,7 @@ static void __sched notrace __schedule(u next = pick_next_task(rq, prev, &rf); clear_tsk_need_resched(prev); + clear_tsk_need_resched_lazy(prev); clear_preempt_need_resched(); - - if (likely(prev != next)) { -@@ -5277,6 +5325,30 @@ static void __sched notrace preempt_schedule_common(void) + #ifdef CONFIG_SCHED_DEBUG + rq->last_seen_need_resched_ns = 0; +@@ -6470,6 +6516,30 @@ static void __sched notrace preempt_sche } while (need_resched()); } @@ -413,7 +401,7 @@ index 7fc1b5eefd3e..f6b931d82443 100644 #ifdef CONFIG_PREEMPTION /* * This is the entry point to schedule() from in-kernel preemption -@@ -5290,7 +5362,8 @@ asmlinkage __visible void __sched notrace preempt_schedule(void) +@@ -6483,7 +6553,8 @@ asmlinkage __visible void __sched notrac */ if (likely(!preemptible())) return; @@ -423,7 +411,7 @@ index 7fc1b5eefd3e..f6b931d82443 100644 preempt_schedule_common(); } NOKPROBE_SYMBOL(preempt_schedule); -@@ -5330,6 +5403,9 @@ asmlinkage __visible void __sched notrace preempt_schedule_notrace(void) +@@ -6516,6 +6587,9 @@ asmlinkage __visible void __sched notrac if (likely(!preemptible())) return; @@ -433,7 +421,7 @@ index 7fc1b5eefd3e..f6b931d82443 100644 do { /* * Because the function tracer can trace preempt_count_sub() -@@ -7165,7 +7241,9 @@ void init_idle(struct task_struct *idle, int cpu) +@@ -8677,7 +8751,9 @@ void __init init_idle(struct task_struct /* Set the preempt count _outside_ the spinlocks! 
*/ init_idle_preempt_count(idle, cpu); @@ -444,11 +432,9 @@ index 7fc1b5eefd3e..f6b931d82443 100644 /* * The idle tasks have their own, simple scheduling class: */ -diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c -index 348605306027..e7d6ae7882c1 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c -@@ -4372,7 +4372,7 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr) +@@ -4445,7 +4445,7 @@ check_preempt_tick(struct cfs_rq *cfs_rq ideal_runtime = sched_slice(cfs_rq, curr); delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime; if (delta_exec > ideal_runtime) { @@ -457,7 +443,7 @@ index 348605306027..e7d6ae7882c1 100644 /* * The current task ran long enough, ensure it doesn't get * re-elected due to buddy favours. -@@ -4396,7 +4396,7 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr) +@@ -4469,7 +4469,7 @@ check_preempt_tick(struct cfs_rq *cfs_rq return; if (delta > ideal_runtime) @@ -466,7 +452,7 @@ index 348605306027..e7d6ae7882c1 100644 } static void -@@ -4539,7 +4539,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued) +@@ -4612,7 +4612,7 @@ entity_tick(struct cfs_rq *cfs_rq, struc * validating it and just reschedule. 
*/ if (queued) { @@ -475,7 +461,7 @@ index 348605306027..e7d6ae7882c1 100644 return; } /* -@@ -4676,7 +4676,7 @@ static void __account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec) +@@ -4752,7 +4752,7 @@ static void __account_cfs_rq_runtime(str * hierarchy can be throttled */ if (!assign_cfs_rq_runtime(cfs_rq) && likely(cfs_rq->curr)) @@ -484,16 +470,16 @@ index 348605306027..e7d6ae7882c1 100644 } static __always_inline -@@ -5411,7 +5411,7 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p) +@@ -5515,7 +5515,7 @@ static void hrtick_start_fair(struct rq if (delta < 0) { - if (rq->curr == p) + if (task_current(rq, p)) - resched_curr(rq); + resched_curr_lazy(rq); return; } hrtick_start(rq, delta); -@@ -6992,7 +6992,7 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_ +@@ -7205,7 +7205,7 @@ static void check_preempt_wakeup(struct return; preempt: @@ -502,7 +488,7 @@ index 348605306027..e7d6ae7882c1 100644 /* * Only set the backward buddy when the current task is still * on the rq. This can happen when a wakeup gets interleaved -@@ -10749,7 +10749,7 @@ static void task_fork_fair(struct task_struct *p) +@@ -11106,7 +11106,7 @@ static void task_fork_fair(struct task_s * 'current' within the tree based on its new key value. 
*/ swap(curr->vruntime, se->vruntime); @@ -511,20 +497,18 @@ index 348605306027..e7d6ae7882c1 100644 } se->vruntime -= cfs_rq->min_vruntime; -@@ -10776,7 +10776,7 @@ prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio) +@@ -11133,7 +11133,7 @@ prio_changed_fair(struct rq *rq, struct */ - if (rq->curr == p) { + if (task_current(rq, p)) { if (p->prio > oldprio) - resched_curr(rq); + resched_curr_lazy(rq); } else check_preempt_curr(rq, p, 0); } -diff --git a/kernel/sched/features.h b/kernel/sched/features.h -index 296aea55f44c..5a2e27297126 100644 --- a/kernel/sched/features.h +++ b/kernel/sched/features.h -@@ -47,6 +47,9 @@ SCHED_FEAT(NONTASK_CAPACITY, true) +@@ -48,6 +48,9 @@ SCHED_FEAT(NONTASK_CAPACITY, true) #ifdef CONFIG_PREEMPT_RT SCHED_FEAT(TTWU_QUEUE, false) @@ -534,11 +518,9 @@ index 296aea55f44c..5a2e27297126 100644 #else /* -diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h -index 32ac4fc0ce76..a6dc180ae5ef 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h -@@ -1988,6 +1988,15 @@ extern void reweight_task(struct task_struct *p, int prio); +@@ -2317,6 +2317,15 @@ extern void reweight_task(struct task_st extern void resched_curr(struct rq *rq); extern void resched_cpu(int cpu); @@ -554,30 +536,24 @@ index 32ac4fc0ce76..a6dc180ae5ef 100644 extern struct rt_bandwidth def_rt_bandwidth; extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime); -diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c -index 255b3f027dfd..b3cc666223ce 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c -@@ -2605,8 +2605,16 @@ unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status) +@@ -2629,7 +2629,13 @@ unsigned int tracing_gen_ctx_irq_test(un trace_flags |= TRACE_FLAG_NEED_RESCHED; if (test_preempt_need_resched()) trace_flags |= TRACE_FLAG_PREEMPT_RESCHED; -- return (trace_flags << 16) | (pc & 0xff) | -- (migration_disable_value() & 0xff) << 8; -+ +- return (trace_flags << 16) | (min_t(unsigned int, pc & 
0xff, 0xf)) | +#ifdef CONFIG_PREEMPT_LAZY + if (need_resched_lazy()) + trace_flags |= TRACE_FLAG_NEED_RESCHED_LAZY; +#endif + -+ return (pc & 0xff) | -+ (migration_disable_value() & 0xff) << 8 | ++ return (trace_flags << 24) | (min_t(unsigned int, pc & 0xff, 0xf)) | + (preempt_lazy_count() & 0xff) << 16 | -+ (trace_flags << 24); + (min_t(unsigned int, migration_disable_value(), 0xf)) << 4; } - struct ring_buffer_event * -@@ -3808,15 +3816,17 @@ unsigned long trace_total_entries(struct trace_array *tr) +@@ -4193,15 +4199,17 @@ unsigned long trace_total_entries(struct static void print_lat_help_header(struct seq_file *m) { @@ -604,7 +580,7 @@ index 255b3f027dfd..b3cc666223ce 100644 } static void print_event_info(struct array_buffer *buf, struct seq_file *m) -@@ -3850,14 +3860,16 @@ static void print_func_help_header_irq(struct array_buffer *buf, struct seq_file +@@ -4235,14 +4243,16 @@ static void print_func_help_header_irq(s print_event_info(buf, m); @@ -629,23 +605,19 @@ index 255b3f027dfd..b3cc666223ce 100644 } void -diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c -index 76e8981edac3..7cfcf301b6e6 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c -@@ -184,6 +184,7 @@ static int trace_define_common_fields(void) +@@ -184,6 +184,7 @@ static int trace_define_common_fields(vo + /* Holds both preempt_count and migrate_disable */ __common_field(unsigned char, preempt_count); __common_field(int, pid); - __common_field(unsigned char, migrate_disable); + __common_field(unsigned char, preempt_lazy_count); return ret; } -diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c -index ad738192647b..bc24ae8e3613 100644 --- a/kernel/trace/trace_output.c +++ b/kernel/trace/trace_output.c -@@ -441,6 +441,7 @@ int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry) +@@ -451,6 +451,7 @@ int trace_print_lat_fmt(struct trace_seq { char hardsoft_irq; char need_resched; @@ -653,7 +625,7 @@ index 
ad738192647b..bc24ae8e3613 100644 char irqs_off; int hardirq; int softirq; -@@ -471,6 +472,9 @@ int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry) +@@ -481,6 +482,9 @@ int trace_print_lat_fmt(struct trace_seq break; } @@ -663,7 +635,7 @@ index ad738192647b..bc24ae8e3613 100644 hardsoft_irq = (nmi && hardirq) ? 'Z' : nmi ? 'z' : -@@ -479,14 +483,20 @@ int trace_print_lat_fmt(struct trace_seq *s, struct trace_entry *entry) +@@ -489,14 +493,20 @@ int trace_print_lat_fmt(struct trace_seq softirq ? 's' : '.' ; @@ -673,8 +645,8 @@ index ad738192647b..bc24ae8e3613 100644 + irqs_off, need_resched, need_resched_lazy, + hardsoft_irq); - if (entry->preempt_count) - trace_seq_printf(s, "%x", entry->preempt_count); + if (entry->preempt_count & 0xf) + trace_seq_printf(s, "%x", entry->preempt_count & 0xf); else trace_seq_putc(s, '.'); @@ -683,9 +655,6 @@ index ad738192647b..bc24ae8e3613 100644 + else + trace_seq_putc(s, '.'); + - if (entry->migrate_disable) - trace_seq_printf(s, "%x", entry->migrate_disable); + if (entry->preempt_count & 0xf0) + trace_seq_printf(s, "%x", entry->preempt_count >> 4); else --- -2.30.2 - diff --git a/debian/patches-rt/sched_introduce_migratable.patch b/debian/patches-rt/sched_introduce_migratable.patch new file mode 100644 index 000000000..8f46414f6 --- /dev/null +++ b/debian/patches-rt/sched_introduce_migratable.patch @@ -0,0 +1,46 @@ +From: Valentin Schneider <valentin.schneider@arm.com> +Subject: sched: Introduce migratable() +Date: Wed, 11 Aug 2021 21:13:52 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +Some areas use preempt_disable() + preempt_enable() to safely access +per-CPU data. The PREEMPT_RT folks have shown this can also be done by +keeping preemption enabled and instead disabling migration (and acquiring a +sleepable lock, if relevant). 
+ +Introduce a helper which checks whether the current task can be migrated +elsewhere, IOW if it is pinned to its local CPU in the current +context. This can help determining if per-CPU properties can be safely +accessed. + +Note that CPU affinity is not checked here, as a preemptible task can have +its affinity changed at any given time (including if it has +PF_NO_SETAFFINITY, when hotplug gets involved). + +Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> +[bigeasy: Return false on UP, call it is_migratable().] +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210811201354.1976839-3-valentin.schneider@arm.com +--- + include/linux/sched.h | 10 ++++++++++ + 1 file changed, 10 insertions(+) + +--- a/include/linux/sched.h ++++ b/include/linux/sched.h +@@ -1730,6 +1730,16 @@ static __always_inline bool is_percpu_th + #endif + } + ++/* Is the current task guaranteed to stay on its current CPU? */ ++static inline bool is_migratable(void) ++{ ++#ifdef CONFIG_SMP ++ return preemptible() && !current->migration_disabled; ++#else ++ return false; ++#endif ++} ++ + /* Per-process atomic flags. */ + #define PFA_NO_NEW_PRIVS 0 /* May not gain new privileges. */ + #define PFA_SPREAD_PAGE 1 /* Spread page cache over cpuset */ diff --git a/debian/patches-rt/0236-scsi-fcoe-Make-RT-aware.patch b/debian/patches-rt/scsi_fcoe__Make_RT_aware..patch index b188badc3..db9850dde 100644 --- a/debian/patches-rt/0236-scsi-fcoe-Make-RT-aware.patch +++ b/debian/patches-rt/scsi_fcoe__Make_RT_aware..patch @@ -1,24 +1,25 @@ -From e75dba7886ff29143dfcca6e5399039811129645 Mon Sep 17 00:00:00 2001 +Subject: scsi/fcoe: Make RT aware. 
+From: Thomas Gleixner <tglx@linutronix.de> +Date: Sat Nov 12 14:00:48 2011 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Sat, 12 Nov 2011 14:00:48 +0100 -Subject: [PATCH 236/296] scsi/fcoe: Make RT aware. -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Do not disable preemption while taking sleeping locks. All user look safe for migrate_diable() only. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - drivers/scsi/fcoe/fcoe.c | 16 ++++++++-------- - drivers/scsi/fcoe/fcoe_ctlr.c | 4 ++-- - drivers/scsi/libfc/fc_exch.c | 4 ++-- + drivers/scsi/fcoe/fcoe.c | 16 ++++++++-------- + drivers/scsi/fcoe/fcoe_ctlr.c | 4 ++-- + drivers/scsi/libfc/fc_exch.c | 4 ++-- 3 files changed, 12 insertions(+), 12 deletions(-) - -diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c -index 0f9274960dc6..dc97e4f1f4ad 100644 +--- --- a/drivers/scsi/fcoe/fcoe.c +++ b/drivers/scsi/fcoe/fcoe.c -@@ -1452,11 +1452,11 @@ static int fcoe_rcv(struct sk_buff *skb, struct net_device *netdev, +@@ -1450,11 +1450,11 @@ static int fcoe_rcv(struct sk_buff *skb, static int fcoe_alloc_paged_crc_eof(struct sk_buff *skb, int tlen) { struct fcoe_percpu_s *fps; @@ -33,7 +34,7 @@ index 0f9274960dc6..dc97e4f1f4ad 100644 return rc; } -@@ -1641,11 +1641,11 @@ static inline int fcoe_filter_frames(struct fc_lport *lport, +@@ -1639,11 +1639,11 @@ static inline int fcoe_filter_frames(str return 0; } @@ -47,7 +48,7 @@ index 0f9274960dc6..dc97e4f1f4ad 100644 return -EINVAL; } -@@ -1686,7 +1686,7 @@ static void fcoe_recv_frame(struct sk_buff *skb) +@@ -1684,7 +1684,7 @@ static void fcoe_recv_frame(struct sk_bu */ hp = (struct fcoe_hdr *) skb_network_header(skb); @@ -56,7 +57,7 @@ index 0f9274960dc6..dc97e4f1f4ad 100644 if (unlikely(FC_FCOE_DECAPS_VER(hp) != FC_FCOE_VER)) { if (stats->ErrorFrames < 5) printk(KERN_WARNING "fcoe: FCoE 
version " -@@ -1718,13 +1718,13 @@ static void fcoe_recv_frame(struct sk_buff *skb) +@@ -1716,13 +1716,13 @@ static void fcoe_recv_frame(struct sk_bu goto drop; if (!fcoe_filter_frames(lport, fp)) { @@ -72,11 +73,9 @@ index 0f9274960dc6..dc97e4f1f4ad 100644 kfree_skb(skb); } -diff --git a/drivers/scsi/fcoe/fcoe_ctlr.c b/drivers/scsi/fcoe/fcoe_ctlr.c -index 5ea426effa60..0d6b9acc7cf8 100644 --- a/drivers/scsi/fcoe/fcoe_ctlr.c +++ b/drivers/scsi/fcoe/fcoe_ctlr.c -@@ -828,7 +828,7 @@ static unsigned long fcoe_ctlr_age_fcfs(struct fcoe_ctlr *fip) +@@ -828,7 +828,7 @@ static unsigned long fcoe_ctlr_age_fcfs( INIT_LIST_HEAD(&del_list); @@ -85,7 +84,7 @@ index 5ea426effa60..0d6b9acc7cf8 100644 list_for_each_entry_safe(fcf, next, &fip->fcfs, list) { deadline = fcf->time + fcf->fka_period + fcf->fka_period / 2; -@@ -864,7 +864,7 @@ static unsigned long fcoe_ctlr_age_fcfs(struct fcoe_ctlr *fip) +@@ -864,7 +864,7 @@ static unsigned long fcoe_ctlr_age_fcfs( sel_time = fcf->time; } } @@ -94,11 +93,9 @@ index 5ea426effa60..0d6b9acc7cf8 100644 list_for_each_entry_safe(fcf, next, &del_list, list) { /* Removes fcf from current list */ -diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c -index a50f1eef0e0c..0b2acad7c354 100644 --- a/drivers/scsi/libfc/fc_exch.c +++ b/drivers/scsi/libfc/fc_exch.c -@@ -826,10 +826,10 @@ static struct fc_exch *fc_exch_em_alloc(struct fc_lport *lport, +@@ -825,10 +825,10 @@ static struct fc_exch *fc_exch_em_alloc( } memset(ep, 0, sizeof(*ep)); @@ -111,6 +108,3 @@ index a50f1eef0e0c..0b2acad7c354 100644 /* peek cache of free slot */ if (pool->left != FC_XID_UNKNOWN) { --- -2.30.2 - diff --git a/debian/patches-rt/0101-serial-8250-implement-write_atomic.patch b/debian/patches-rt/serial__8250__implement_write_atomic.patch index ae0039e4d..d872f47fd 100644 --- a/debian/patches-rt/0101-serial-8250-implement-write_atomic.patch +++ b/debian/patches-rt/serial__8250__implement_write_atomic.patch @@ -1,8 +1,9 @@ -From 
1d6e8d42676fcbd888636812dddc619793478df9 Mon Sep 17 00:00:00 2001 +Subject: serial: 8250: implement write_atomic +From: John Ogness <john.ogness@linutronix.de> +Date: Mon Nov 30 01:42:02 2020 +0106 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: John Ogness <john.ogness@linutronix.de> -Date: Mon, 30 Nov 2020 01:42:02 +0106 -Subject: [PATCH 101/296] serial: 8250: implement write_atomic -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Implement a non-sleeping NMI-safe write_atomic() console function in order to support emergency console printing. @@ -15,22 +16,21 @@ write_atomic() can be called from an NMI context that has preempted write_atomic(). Signed-off-by: John Ogness <john.ogness@linutronix.de> -Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + --- - drivers/tty/serial/8250/8250.h | 47 ++++++++++++- - drivers/tty/serial/8250/8250_core.c | 17 +++-- - drivers/tty/serial/8250/8250_fsl.c | 9 +++ - drivers/tty/serial/8250/8250_ingenic.c | 7 ++ - drivers/tty/serial/8250/8250_mtk.c | 29 +++++++- - drivers/tty/serial/8250/8250_port.c | 92 ++++++++++++++++---------- - include/linux/serial_8250.h | 5 ++ + drivers/tty/serial/8250/8250.h | 47 ++++++++++++++++ + drivers/tty/serial/8250/8250_core.c | 17 ++++-- + drivers/tty/serial/8250/8250_fsl.c | 9 +++ + drivers/tty/serial/8250/8250_ingenic.c | 7 ++ + drivers/tty/serial/8250/8250_mtk.c | 29 +++++++++- + drivers/tty/serial/8250/8250_port.c | 92 ++++++++++++++++++++------------- + include/linux/serial_8250.h | 5 + 7 files changed, 162 insertions(+), 44 deletions(-) - -diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h -index 52bb21205bb6..5cbcaafbb4aa 100644 +--- --- a/drivers/tty/serial/8250/8250.h +++ b/drivers/tty/serial/8250/8250.h -@@ -130,12 +130,55 @@ static inline void serial_dl_write(struct uart_8250_port *up, 
int value) +@@ -132,12 +132,55 @@ static inline void serial_dl_write(struc up->dl_write(up, value); } @@ -38,13 +38,13 @@ index 52bb21205bb6..5cbcaafbb4aa 100644 + unsigned char ier) +{ + struct uart_port *port = &up->port; -+ unsigned int flags; ++ unsigned long flags; + bool is_console; + + is_console = uart_console(port); + + if (is_console) -+ console_atomic_lock(&flags); ++ console_atomic_lock(flags); + + serial_out(up, UART_IER, ier); + @@ -56,8 +56,8 @@ index 52bb21205bb6..5cbcaafbb4aa 100644 +{ + struct uart_port *port = &up->port; + unsigned int clearval = 0; ++ unsigned long flags; + unsigned int prior; -+ unsigned int flags; + bool is_console; + + is_console = uart_console(port); @@ -66,7 +66,7 @@ index 52bb21205bb6..5cbcaafbb4aa 100644 + clearval = UART_IER_UUE; + + if (is_console) -+ console_atomic_lock(&flags); ++ console_atomic_lock(flags); + + prior = serial_port_in(port, UART_IER); + serial_port_out(port, UART_IER, clearval); @@ -87,7 +87,7 @@ index 52bb21205bb6..5cbcaafbb4aa 100644 return true; } -@@ -144,7 +187,7 @@ static inline bool serial8250_clear_THRI(struct uart_8250_port *up) +@@ -146,7 +189,7 @@ static inline bool serial8250_clear_THRI if (!(up->ier & UART_IER_THRI)) return false; up->ier &= ~UART_IER_THRI; @@ -96,11 +96,9 @@ index 52bb21205bb6..5cbcaafbb4aa 100644 return true; } -diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c -index cae61d1ebec5..47dd23056271 100644 --- a/drivers/tty/serial/8250/8250_core.c +++ b/drivers/tty/serial/8250/8250_core.c -@@ -274,10 +274,8 @@ static void serial8250_backup_timeout(struct timer_list *t) +@@ -264,10 +264,8 @@ static void serial8250_backup_timeout(st * Must disable interrupts or else we risk racing with the interrupt * based handler. 
*/ @@ -113,7 +111,7 @@ index cae61d1ebec5..47dd23056271 100644 iir = serial_in(up, UART_IIR); -@@ -300,7 +298,7 @@ static void serial8250_backup_timeout(struct timer_list *t) +@@ -290,7 +288,7 @@ static void serial8250_backup_timeout(st serial8250_tx_chars(up); if (up->port.irq) @@ -122,7 +120,7 @@ index cae61d1ebec5..47dd23056271 100644 spin_unlock_irqrestore(&up->port.lock, flags); -@@ -578,6 +576,14 @@ serial8250_register_ports(struct uart_driver *drv, struct device *dev) +@@ -568,6 +566,14 @@ serial8250_register_ports(struct uart_dr #ifdef CONFIG_SERIAL_8250_CONSOLE @@ -137,7 +135,7 @@ index cae61d1ebec5..47dd23056271 100644 static void univ8250_console_write(struct console *co, const char *s, unsigned int count) { -@@ -671,6 +677,7 @@ static int univ8250_console_match(struct console *co, char *name, int idx, +@@ -661,6 +667,7 @@ static int univ8250_console_match(struct static struct console univ8250_console = { .name = "ttyS", @@ -145,73 +143,67 @@ index cae61d1ebec5..47dd23056271 100644 .write = univ8250_console_write, .device = uart_console_device, .setup = univ8250_console_setup, -diff --git a/drivers/tty/serial/8250/8250_fsl.c b/drivers/tty/serial/8250/8250_fsl.c -index fbcc90c31ca1..b33cb454ce03 100644 --- a/drivers/tty/serial/8250/8250_fsl.c +++ b/drivers/tty/serial/8250/8250_fsl.c -@@ -60,9 +60,18 @@ int fsl8250_handle_irq(struct uart_port *port) +@@ -60,9 +60,18 @@ int fsl8250_handle_irq(struct uart_port /* Stop processing interrupts on input overrun */ if ((orig_lsr & UART_LSR_OE) && (up->overrun_backoff_time_ms > 0)) { -+ unsigned int ca_flags; ++ unsigned long flags; unsigned long delay; + bool is_console; + is_console = uart_console(port); + + if (is_console) -+ console_atomic_lock(&ca_flags); ++ console_atomic_lock(flags); up->ier = port->serial_in(port, UART_IER); + if (is_console) -+ console_atomic_unlock(ca_flags); ++ console_atomic_unlock(flags); + if (up->ier & (UART_IER_RLSI | UART_IER_RDI)) { port->ops->stop_rx(port); } else { -diff --git 
a/drivers/tty/serial/8250/8250_ingenic.c b/drivers/tty/serial/8250/8250_ingenic.c -index 988bf6bcce42..bcd26d672539 100644 --- a/drivers/tty/serial/8250/8250_ingenic.c +++ b/drivers/tty/serial/8250/8250_ingenic.c -@@ -146,6 +146,8 @@ OF_EARLYCON_DECLARE(x1000_uart, "ingenic,x1000-uart", +@@ -146,6 +146,8 @@ OF_EARLYCON_DECLARE(x1000_uart, "ingenic static void ingenic_uart_serial_out(struct uart_port *p, int offset, int value) { -+ unsigned int flags; ++ unsigned long flags; + bool is_console; int ier; switch (offset) { -@@ -167,7 +169,12 @@ static void ingenic_uart_serial_out(struct uart_port *p, int offset, int value) +@@ -167,7 +169,12 @@ static void ingenic_uart_serial_out(stru * If we have enabled modem status IRQs we should enable * modem mode. */ + is_console = uart_console(p); + if (is_console) -+ console_atomic_lock(&flags); ++ console_atomic_lock(flags); ier = p->serial_in(p, UART_IER); + if (is_console) + console_atomic_unlock(flags); if (ier & UART_IER_MSI) value |= UART_MCR_MDCE | UART_MCR_FCM; -diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c -index f7d3023f860f..8133713dcf5e 100644 --- a/drivers/tty/serial/8250/8250_mtk.c +++ b/drivers/tty/serial/8250/8250_mtk.c -@@ -213,12 +213,37 @@ static void mtk8250_shutdown(struct uart_port *port) +@@ -218,12 +218,37 @@ static void mtk8250_shutdown(struct uart static void mtk8250_disable_intrs(struct uart_8250_port *up, int mask) { - serial_out(up, UART_IER, serial_in(up, UART_IER) & (~mask)); + struct uart_port *port = &up->port; -+ unsigned int flags; ++ unsigned long flags; + unsigned int ier; + bool is_console; + + is_console = uart_console(port); + + if (is_console) -+ console_atomic_lock(&flags); ++ console_atomic_lock(flags); + + ier = serial_in(up, UART_IER); + serial_out(up, UART_IER, ier & (~mask)); @@ -224,11 +216,11 @@ index f7d3023f860f..8133713dcf5e 100644 { - serial_out(up, UART_IER, serial_in(up, UART_IER) | mask); + struct uart_port *port = &up->port; -+ 
unsigned int flags; ++ unsigned long flags; + unsigned int ier; + + if (uart_console(port)) -+ console_atomic_lock(&flags); ++ console_atomic_lock(flags); + + ier = serial_in(up, UART_IER); + serial_out(up, UART_IER, ier | mask); @@ -238,11 +230,9 @@ index f7d3023f860f..8133713dcf5e 100644 } static void mtk8250_set_flow_ctrl(struct uart_8250_port *up, int mode) -diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c -index b0af13074cd3..b05f8c34b291 100644 --- a/drivers/tty/serial/8250/8250_port.c +++ b/drivers/tty/serial/8250/8250_port.c -@@ -757,7 +757,7 @@ static void serial8250_set_sleep(struct uart_8250_port *p, int sleep) +@@ -762,7 +762,7 @@ static void serial8250_set_sleep(struct serial_out(p, UART_EFR, UART_EFR_ECB); serial_out(p, UART_LCR, 0); } @@ -251,7 +241,7 @@ index b0af13074cd3..b05f8c34b291 100644 if (p->capabilities & UART_CAP_EFR) { serial_out(p, UART_LCR, UART_LCR_CONF_MODE_B); serial_out(p, UART_EFR, efr); -@@ -1429,7 +1429,7 @@ static void serial8250_stop_rx(struct uart_port *port) +@@ -1436,7 +1436,7 @@ static void serial8250_stop_rx(struct ua up->ier &= ~(UART_IER_RLSI | UART_IER_RDI); up->port.read_status_mask &= ~UART_LSR_DR; @@ -260,7 +250,7 @@ index b0af13074cd3..b05f8c34b291 100644 serial8250_rpm_put(up); } -@@ -1459,7 +1459,7 @@ void serial8250_em485_stop_tx(struct uart_8250_port *p) +@@ -1466,7 +1466,7 @@ void serial8250_em485_stop_tx(struct uar serial8250_clear_and_reinit_fifos(p); p->ier |= UART_IER_RLSI | UART_IER_RDI; @@ -269,7 +259,7 @@ index b0af13074cd3..b05f8c34b291 100644 } } EXPORT_SYMBOL_GPL(serial8250_em485_stop_tx); -@@ -1687,7 +1687,7 @@ static void serial8250_disable_ms(struct uart_port *port) +@@ -1688,7 +1688,7 @@ static void serial8250_disable_ms(struct mctrl_gpio_disable_ms(up->gpios); up->ier &= ~UART_IER_MSI; @@ -278,7 +268,7 @@ index b0af13074cd3..b05f8c34b291 100644 } static void serial8250_enable_ms(struct uart_port *port) -@@ -1703,7 +1703,7 @@ static void 
serial8250_enable_ms(struct uart_port *port) +@@ -1704,7 +1704,7 @@ static void serial8250_enable_ms(struct up->ier |= UART_IER_MSI; serial8250_rpm_get(up); @@ -287,7 +277,7 @@ index b0af13074cd3..b05f8c34b291 100644 serial8250_rpm_put(up); } -@@ -2118,14 +2118,7 @@ static void serial8250_put_poll_char(struct uart_port *port, +@@ -2132,14 +2132,7 @@ static void serial8250_put_poll_char(str struct uart_8250_port *up = up_to_u8250p(port); serial8250_rpm_get(up); @@ -303,7 +293,7 @@ index b0af13074cd3..b05f8c34b291 100644 wait_for_xmitr(up, BOTH_EMPTY); /* -@@ -2138,7 +2131,7 @@ static void serial8250_put_poll_char(struct uart_port *port, +@@ -2152,7 +2145,7 @@ static void serial8250_put_poll_char(str * and restore the IER */ wait_for_xmitr(up, BOTH_EMPTY); @@ -312,7 +302,7 @@ index b0af13074cd3..b05f8c34b291 100644 serial8250_rpm_put(up); } -@@ -2441,7 +2434,7 @@ void serial8250_do_shutdown(struct uart_port *port) +@@ -2455,7 +2448,7 @@ void serial8250_do_shutdown(struct uart_ */ spin_lock_irqsave(&port->lock, flags); up->ier = 0; @@ -321,7 +311,7 @@ index b0af13074cd3..b05f8c34b291 100644 spin_unlock_irqrestore(&port->lock, flags); synchronize_irq(port->irq); -@@ -2771,7 +2764,7 @@ serial8250_do_set_termios(struct uart_port *port, struct ktermios *termios, +@@ -2824,7 +2817,7 @@ serial8250_do_set_termios(struct uart_po if (up->capabilities & UART_CAP_RTOIE) up->ier |= UART_IER_RTOIE; @@ -330,7 +320,7 @@ index b0af13074cd3..b05f8c34b291 100644 if (up->capabilities & UART_CAP_EFR) { unsigned char efr = 0; -@@ -3237,7 +3230,7 @@ EXPORT_SYMBOL_GPL(serial8250_set_defaults); +@@ -3290,7 +3283,7 @@ EXPORT_SYMBOL_GPL(serial8250_set_default #ifdef CONFIG_SERIAL_8250_CONSOLE @@ -339,18 +329,18 @@ index b0af13074cd3..b05f8c34b291 100644 { struct uart_8250_port *up = up_to_u8250p(port); -@@ -3245,6 +3238,18 @@ static void serial8250_console_putchar(struct uart_port *port, int ch) +@@ -3298,6 +3291,18 @@ static void serial8250_console_putchar(s serial_port_out(port, UART_TX, 
ch); } +static void serial8250_console_putchar(struct uart_port *port, int ch) +{ + struct uart_8250_port *up = up_to_u8250p(port); -+ unsigned int flags; ++ unsigned long flags; + + wait_for_xmitr(up, UART_LSR_THRE); + -+ console_atomic_lock(&flags); ++ console_atomic_lock(flags); + serial8250_console_putchar_locked(port, ch); + console_atomic_unlock(flags); +} @@ -358,7 +348,7 @@ index b0af13074cd3..b05f8c34b291 100644 /* * Restore serial console when h/w power-off detected */ -@@ -3266,6 +3271,32 @@ static void serial8250_console_restore(struct uart_8250_port *up) +@@ -3319,6 +3324,32 @@ static void serial8250_console_restore(s serial8250_out_MCR(up, UART_MCR_DTR | UART_MCR_RTS); } @@ -366,10 +356,10 @@ index b0af13074cd3..b05f8c34b291 100644 + const char *s, unsigned int count) +{ + struct uart_port *port = &up->port; -+ unsigned int flags; ++ unsigned long flags; + unsigned int ier; + -+ console_atomic_lock(&flags); ++ console_atomic_lock(flags); + + touch_nmi_watchdog(); + @@ -391,7 +381,7 @@ index b0af13074cd3..b05f8c34b291 100644 /* * Print a string to the serial port trying not to disturb * any possible real use of the port... 
-@@ -3282,24 +3313,12 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s, +@@ -3335,24 +3366,12 @@ void serial8250_console_write(struct uar struct uart_port *port = &up->port; unsigned long flags; unsigned int ier; @@ -418,7 +408,7 @@ index b0af13074cd3..b05f8c34b291 100644 /* check scratch reg to see if port powered off during system sleep */ if (up->canary && (up->canary != serial_port_in(port, UART_SCR))) { -@@ -3313,7 +3332,9 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s, +@@ -3366,7 +3385,9 @@ void serial8250_console_write(struct uar mdelay(port->rs485.delay_rts_before_send); } @@ -428,7 +418,7 @@ index b0af13074cd3..b05f8c34b291 100644 /* * Finally, wait for transmitter to become empty -@@ -3326,8 +3347,7 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s, +@@ -3379,8 +3400,7 @@ void serial8250_console_write(struct uar if (em485->tx_stopped) up->rs485_stop_tx(up); } @@ -438,7 +428,7 @@ index b0af13074cd3..b05f8c34b291 100644 /* * The receive handling will happen properly because the -@@ -3339,8 +3359,7 @@ void serial8250_console_write(struct uart_8250_port *up, const char *s, +@@ -3392,8 +3412,7 @@ void serial8250_console_write(struct uar if (up->msr_saved_flags) serial8250_modem_status(up); @@ -448,7 +438,7 @@ index b0af13074cd3..b05f8c34b291 100644 } static unsigned int probe_baud(struct uart_port *port) -@@ -3360,6 +3379,7 @@ static unsigned int probe_baud(struct uart_port *port) +@@ -3413,6 +3432,7 @@ static unsigned int probe_baud(struct ua int serial8250_console_setup(struct uart_port *port, char *options, bool probe) { @@ -456,7 +446,7 @@ index b0af13074cd3..b05f8c34b291 100644 int baud = 9600; int bits = 8; int parity = 'n'; -@@ -3369,6 +3389,8 @@ int serial8250_console_setup(struct uart_port *port, char *options, bool probe) +@@ -3422,6 +3442,8 @@ int serial8250_console_setup(struct uart if (!port->iobase && !port->membase) return -ENODEV; @@ -465,8 +455,6 @@ index 
b0af13074cd3..b05f8c34b291 100644 if (options) uart_parse_options(options, &baud, &parity, &bits, &flow); else if (probe) -diff --git a/include/linux/serial_8250.h b/include/linux/serial_8250.h -index 2b70f736b091..68d756373b53 100644 --- a/include/linux/serial_8250.h +++ b/include/linux/serial_8250.h @@ -7,6 +7,7 @@ @@ -486,7 +474,7 @@ index 2b70f736b091..68d756373b53 100644 struct uart_8250_dma *dma; const struct uart_8250_ops *ops; -@@ -180,6 +183,8 @@ void serial8250_init_port(struct uart_8250_port *up); +@@ -180,6 +183,8 @@ void serial8250_init_port(struct uart_82 void serial8250_set_defaults(struct uart_8250_port *up); void serial8250_console_write(struct uart_8250_port *up, const char *s, unsigned int count); @@ -495,6 +483,3 @@ index 2b70f736b091..68d756373b53 100644 int serial8250_console_setup(struct uart_port *port, char *options, bool probe); int serial8250_console_exit(struct uart_port *port); --- -2.30.2 - diff --git a/debian/patches-rt/series b/debian/patches-rt/series index 42b36bb12..cf97ca82b 100644 --- a/debian/patches-rt/series +++ b/debian/patches-rt/series @@ -1,296 +1,271 @@ -0001-z3fold-remove-preempt-disabled-sections-for-RT.patch -0002-stop_machine-Add-function-and-caller-debug-info.patch -0003-sched-Fix-balance_callback.patch -0004-sched-hotplug-Ensure-only-per-cpu-kthreads-run-durin.patch -0005-sched-core-Wait-for-tasks-being-pushed-away-on-hotpl.patch -0006-workqueue-Manually-break-affinity-on-hotplug.patch -0007-sched-hotplug-Consolidate-task-migration-on-CPU-unpl.patch -0008-sched-Fix-hotplug-vs-CPU-bandwidth-control.patch -0009-sched-Massage-set_cpus_allowed.patch -0010-sched-Add-migrate_disable.patch -0011-sched-Fix-migrate_disable-vs-set_cpus_allowed_ptr.patch -0012-sched-core-Make-migrate-disable-and-CPU-hotplug-coop.patch -0013-sched-rt-Use-cpumask_any-_distribute.patch -0014-sched-rt-Use-the-full-cpumask-for-balancing.patch -0015-sched-lockdep-Annotate-pi_lock-recursion.patch 
-0016-sched-Fix-migrate_disable-vs-rt-dl-balancing.patch -0017-sched-proc-Print-accurate-cpumask-vs-migrate_disable.patch -0018-sched-Add-migrate_disable-tracepoints.patch -0019-sched-Deny-self-issued-__set_cpus_allowed_ptr-when-m.patch -0020-sched-Comment-affine_move_task.patch -0021-sched-Unlock-the-rq-in-affine_move_task-error-path.patch -0022-sched-Fix-migration_cpu_stop-WARN.patch -0023-sched-core-Add-missing-completion-for-affine_move_ta.patch -0024-mm-highmem-Un-EXPORT-__kmap_atomic_idx.patch -0025-highmem-Remove-unused-functions.patch -0026-fs-Remove-asm-kmap_types.h-includes.patch -0027-sh-highmem-Remove-all-traces-of-unused-cruft.patch -0028-asm-generic-Provide-kmap_size.h.patch -0029-highmem-Provide-generic-variant-of-kmap_atomic.patch -0030-highmem-Make-DEBUG_HIGHMEM-functional.patch -0031-x86-mm-highmem-Use-generic-kmap-atomic-implementatio.patch -0032-arc-mm-highmem-Use-generic-kmap-atomic-implementatio.patch -0033-ARM-highmem-Switch-to-generic-kmap-atomic.patch -0034-csky-mm-highmem-Switch-to-generic-kmap-atomic.patch -0035-microblaze-mm-highmem-Switch-to-generic-kmap-atomic.patch -0036-mips-mm-highmem-Switch-to-generic-kmap-atomic.patch -0037-nds32-mm-highmem-Switch-to-generic-kmap-atomic.patch -0038-powerpc-mm-highmem-Switch-to-generic-kmap-atomic.patch -0039-sparc-mm-highmem-Switch-to-generic-kmap-atomic.patch -0040-xtensa-mm-highmem-Switch-to-generic-kmap-atomic.patch -0041-highmem-Get-rid-of-kmap_types.h.patch -0042-mm-highmem-Remove-the-old-kmap_atomic-cruft.patch -0043-io-mapping-Cleanup-atomic-iomap.patch -0044-Documentation-io-mapping-Remove-outdated-blurb.patch -0045-highmem-High-implementation-details-and-document-API.patch -0046-sched-Make-migrate_disable-enable-independent-of-RT.patch -0047-sched-highmem-Store-local-kmaps-in-task-struct.patch -0048-mm-highmem-Provide-kmap_local.patch -0049-io-mapping-Provide-iomap_local-variant.patch -0050-x86-crashdump-32-Simplify-copy_oldmem_page.patch 
-0051-mips-crashdump-Simplify-copy_oldmem_page.patch -0052-ARM-mm-Replace-kmap_atomic_pfn.patch -0053-highmem-Remove-kmap_atomic_pfn.patch -0054-drm-ttm-Replace-kmap_atomic-usage.patch -0055-drm-vmgfx-Replace-kmap_atomic.patch -0056-highmem-Remove-kmap_atomic_prot.patch -0057-drm-qxl-Replace-io_mapping_map_atomic_wc.patch -0058-drm-nouveau-device-Replace-io_mapping_map_atomic_wc.patch -0059-drm-i915-Replace-io_mapping_map_atomic_wc.patch -0060-io-mapping-Remove-io_mapping_map_atomic_wc.patch -0061-mm-highmem-Take-kmap_high_get-properly-into-account.patch -0062-highmem-Don-t-disable-preemption-on-RT-in-kmap_atomi.patch -0063-timers-Move-clearing-of-base-timer_running-under-bas.patch -0064-blk-mq-Don-t-complete-on-a-remote-CPU-in-force-threa.patch -0065-blk-mq-Always-complete-remote-completions-requests-i.patch -0066-blk-mq-Use-llist_head-for-blk_cpu_done.patch -0067-lib-test_lockup-Minimum-fix-to-get-it-compiled-on-PR.patch -0068-timers-Don-t-block-on-expiry_lock-for-TIMER_IRQSAFE.patch -0069-kthread-Move-prio-affinite-change-into-the-newly-cre.patch -0070-genirq-Move-prio-assignment-into-the-newly-created-t.patch -0071-notifier-Make-atomic_notifiers-use-raw_spinlock.patch -0072-rcu-Make-RCU_BOOST-default-on-CONFIG_PREEMPT_RT.patch -0073-rcu-Unconditionally-use-rcuc-threads-on-PREEMPT_RT.patch -0074-rcu-Enable-rcu_normal_after_boot-unconditionally-for.patch -0075-doc-Update-RCU-s-requirements-page-about-the-PREEMPT.patch -0076-doc-Use-CONFIG_PREEMPTION.patch -0077-tracing-Merge-irqflags-preempt-counter.patch -0078-tracing-Inline-tracing_gen_ctx_flags.patch -0079-tracing-Use-in_serving_softirq-to-deduct-softirq-sta.patch -0080-tracing-Remove-NULL-check-from-current-in-tracing_ge.patch -0081-printk-inline-log_output-log_store-in-vprintk_store.patch -0082-printk-remove-logbuf_lock-writer-protection-of-ringb.patch -0083-printk-limit-second-loop-of-syslog_print_all.patch -0084-printk-kmsg_dump-remove-unused-fields.patch -0085-printk-refactor-kmsg_dump_get_buffer.patch 
-0086-printk-consolidate-kmsg_dump_get_buffer-syslog_print.patch -0087-printk-introduce-CONSOLE_LOG_MAX-for-improved-multi-.patch -0088-printk-use-seqcount_latch-for-clear_seq.patch -0089-printk-use-atomic64_t-for-devkmsg_user.seq.patch -0090-printk-add-syslog_lock.patch -0091-printk-introduce-a-kmsg_dump-iterator.patch -0092-um-synchronize-kmsg_dumper.patch -0093-printk-remove-logbuf_lock.patch -0094-printk-kmsg_dump-remove-_nolock-variants.patch -0095-printk-kmsg_dump-use-kmsg_dump_rewind.patch -0096-printk-console-remove-unnecessary-safe-buffer-usage.patch -0097-printk-track-limit-recursion.patch -0098-printk-remove-safe-buffers.patch -0099-printk-convert-syslog_lock-to-spin_lock.patch -0100-console-add-write_atomic-interface.patch -0101-serial-8250-implement-write_atomic.patch -0102-printk-relocate-printk_delay-and-vprintk_default.patch -0103-printk-combine-boot_delay_msec-into-printk_delay.patch -0104-printk-change-console_seq-to-atomic64_t.patch -0105-printk-introduce-kernel-sync-mode.patch -0106-printk-move-console-printing-to-kthreads.patch -0107-printk-remove-deferred-printing.patch -0108-printk-add-console-handover.patch -0109-printk-add-pr_flush.patch -0110-cgroup-use-irqsave-in-cgroup_rstat_flush_locked.patch -0111-mm-workingset-replace-IRQ-off-check-with-a-lockdep-a.patch -0112-tpm-remove-tpm_dev_wq_lock.patch -0113-shmem-Use-raw_spinlock_t-for-stat_lock.patch -0114-net-Move-lockdep-where-it-belongs.patch -0115-tcp-Remove-superfluous-BH-disable-around-listening_h.patch -0116-parisc-Remove-bogus-__IRQ_STAT-macro.patch -0117-sh-Get-rid-of-nmi_count.patch -0118-irqstat-Get-rid-of-nmi_count-and-__IRQ_STAT.patch -0119-um-irqstat-Get-rid-of-the-duplicated-declarations.patch -0120-ARM-irqstat-Get-rid-of-duplicated-declaration.patch -0121-arm64-irqstat-Get-rid-of-duplicated-declaration.patch -0122-asm-generic-irqstat-Add-optional-__nmi_count-member.patch -0123-sh-irqstat-Use-the-generic-irq_cpustat_t.patch 
-0124-irqstat-Move-declaration-into-asm-generic-hardirq.h.patch -0125-preempt-Cleanup-the-macro-maze-a-bit.patch -0126-softirq-Move-related-code-into-one-section.patch -0127-sh-irq-Add-missing-closing-parentheses-in-arch_show_.patch -0128-sched-cputime-Remove-symbol-exports-from-IRQ-time-ac.patch -0129-s390-vtime-Use-the-generic-IRQ-entry-accounting.patch -0130-sched-vtime-Consolidate-IRQ-time-accounting.patch -0131-irqtime-Move-irqtime-entry-accounting-after-irq-offs.patch -0132-irq-Call-tick_irq_enter-inside-HARDIRQ_OFFSET.patch -0133-smp-Wake-ksoftirqd-on-PREEMPT_RT-instead-do_softirq.patch -0134-net-arcnet-Fix-RESET-flag-handling.patch -0135-tasklets-Replace-barrier-with-cpu_relax-in-tasklet_u.patch -0136-tasklets-Use-static-inlines-for-stub-implementations.patch -0137-tasklets-Provide-tasklet_disable_in_atomic.patch -0138-tasklets-Use-spin-wait-in-tasklet_disable-temporaril.patch -0139-tasklets-Replace-spin-wait-in-tasklet_unlock_wait.patch -0140-tasklets-Replace-spin-wait-in-tasklet_kill.patch -0141-tasklets-Prevent-tasklet_unlock_spin_wait-deadlock-o.patch -0142-net-jme-Replace-link-change-tasklet-with-work.patch -0143-net-sundance-Use-tasklet_disable_in_atomic.patch -0144-ath9k-Use-tasklet_disable_in_atomic.patch -0145-atm-eni-Use-tasklet_disable_in_atomic-in-the-send-ca.patch -0146-PCI-hv-Use-tasklet_disable_in_atomic.patch -0147-firewire-ohci-Use-tasklet_disable_in_atomic-where-re.patch -0148-tasklets-Switch-tasklet_disable-to-the-sleep-wait-va.patch -0149-softirq-Add-RT-specific-softirq-accounting.patch -0150-irqtime-Make-accounting-correct-on-RT.patch -0151-softirq-Move-various-protections-into-inline-helpers.patch -0152-softirq-Make-softirq-control-and-processing-RT-aware.patch -0153-tick-sched-Prevent-false-positive-softirq-pending-wa.patch -0154-rcu-Prevent-false-positive-softirq-warning-on-RT.patch -0155-chelsio-cxgb-Replace-the-workqueue-with-threaded-int.patch -0156-chelsio-cxgb-Disable-the-card-on-error-in-threaded-i.patch 
-0157-x86-fpu-Simplify-fpregs_-un-lock.patch -0158-x86-fpu-Make-kernel-FPU-protection-RT-friendly.patch -0159-locking-rtmutex-Remove-cruft.patch -0160-locking-rtmutex-Remove-output-from-deadlock-detector.patch -0161-locking-rtmutex-Move-rt_mutex_init-outside-of-CONFIG.patch -0162-locking-rtmutex-Remove-rt_mutex_timed_lock.patch -0163-locking-rtmutex-Handle-the-various-new-futex-race-co.patch -0164-futex-Fix-bug-on-when-a-requeued-RT-task-times-out.patch -0165-locking-rtmutex-Make-lock_killable-work.patch -0166-locking-spinlock-Split-the-lock-types-header.patch -0167-locking-rtmutex-Avoid-include-hell.patch -0168-lockdep-Reduce-header-files-in-debug_locks.h.patch -0169-locking-split-out-the-rbtree-definition.patch -0170-locking-rtmutex-Provide-rt_mutex_slowlock_locked.patch -0171-locking-rtmutex-export-lockdep-less-version-of-rt_mu.patch -0172-sched-Add-saved_state-for-tasks-blocked-on-sleeping-.patch -0173-locking-rtmutex-add-sleeping-lock-implementation.patch -0174-locking-rtmutex-Allow-rt_mutex_trylock-on-PREEMPT_RT.patch -0175-locking-rtmutex-add-mutex-implementation-based-on-rt.patch -0176-locking-rtmutex-add-rwsem-implementation-based-on-rt.patch -0177-locking-rtmutex-add-rwlock-implementation-based-on-r.patch -0178-locking-rtmutex-wire-up-RT-s-locking.patch -0179-locking-rtmutex-add-ww_mutex-addon-for-mutex-rt.patch -0180-locking-rtmutex-Use-custom-scheduling-function-for-s.patch -0181-signal-Revert-ptrace-preempt-magic.patch -0182-preempt-Provide-preempt_-_-no-rt-variants.patch -0183-mm-vmstat-Protect-per-cpu-variables-with-preempt-dis.patch -0184-mm-memcontrol-Disable-preemption-in-__mod_memcg_lruv.patch -0185-xfrm-Use-sequence-counter-with-associated-spinlock.patch -0186-u64_stats-Disable-preemption-on-32bit-UP-SMP-with-RT.patch -0187-fs-dcache-use-swait_queue-instead-of-waitqueue.patch -0188-fs-dcache-disable-preemption-on-i_dir_seq-s-write-si.patch -0189-net-Qdisc-use-a-seqlock-instead-seqcount.patch 
-0190-net-Properly-annotate-the-try-lock-for-the-seqlock.patch -0191-kconfig-Disable-config-options-which-are-not-RT-comp.patch -0192-mm-Allow-only-SLUB-on-RT.patch -0193-sched-Disable-CONFIG_RT_GROUP_SCHED-on-RT.patch -0194-net-core-disable-NET_RX_BUSY_POLL-on-RT.patch -0195-efi-Disable-runtime-services-on-RT.patch -0196-efi-Allow-efi-runtime.patch -0197-rt-Add-local-irq-locks.patch -0198-signal-x86-Delay-calling-signals-in-atomic.patch -0199-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch -0200-Split-IRQ-off-and-zone-lock-while-freeing-pages-from.patch -0201-mm-SLxB-change-list_lock-to-raw_spinlock_t.patch -0202-mm-SLUB-delay-giving-back-empty-slubs-to-IRQ-enabled.patch -0203-mm-slub-Always-flush-the-delayed-empty-slubs-in-flus.patch -0204-mm-slub-Don-t-resize-the-location-tracking-cache-on-.patch -0205-mm-page_alloc-Use-migrate_disable-in-drain_local_pag.patch -0206-mm-page_alloc-rt-friendly-per-cpu-pages.patch -0207-mm-slub-Make-object_map_lock-a-raw_spinlock_t.patch -0208-slub-Enable-irqs-for-__GFP_WAIT.patch -0209-slub-Disable-SLUB_CPU_PARTIAL.patch -0210-mm-memcontrol-Provide-a-local_lock-for-per-CPU-memcg.patch -0211-mm-memcontrol-Don-t-call-schedule_work_on-in-preempt.patch -0212-mm-memcontrol-Replace-local_irq_disable-with-local-l.patch -0213-mm-zsmalloc-copy-with-get_cpu_var-and-locking.patch -0214-mm-zswap-Use-local-lock-to-protect-per-CPU-data.patch -0215-x86-kvm-Require-const-tsc-for-RT.patch -0216-wait.h-include-atomic.h.patch -0217-sched-Limit-the-number-of-task-migrations-per-batch.patch -0218-sched-Move-mmdrop-to-RCU-on-RT.patch -0219-kernel-sched-move-stack-kprobe-clean-up-to-__put_tas.patch -0220-sched-Do-not-account-rcu_preempt_depth-on-RT-in-migh.patch -0221-sched-Disable-TTWU_QUEUE-on-RT.patch -0222-softirq-Check-preemption-after-reenabling-interrupts.patch -0223-softirq-Disable-softirq-stacks-for-RT.patch -0224-net-core-use-local_bh_disable-in-netif_rx_ni.patch -0225-pid.h-include-atomic.h.patch 
-0226-ptrace-fix-ptrace-vs-tasklist_lock-race.patch -0227-ptrace-fix-ptrace_unfreeze_traced-race-with-rt-lock.patch -0228-kernel-sched-add-put-get-_cpu_light.patch -0229-trace-Add-migrate-disabled-counter-to-tracing-output.patch -0230-locking-don-t-check-for-__LINUX_SPINLOCK_TYPES_H-on-.patch -0231-locking-Make-spinlock_t-and-rwlock_t-a-RCU-section-o.patch -0232-rcutorture-Avoid-problematic-critical-section-nestin.patch -0233-mm-vmalloc-Another-preempt-disable-region-which-suck.patch -0234-block-mq-do-not-invoke-preempt_disable.patch -0235-md-raid5-Make-raid5_percpu-handling-RT-aware.patch -0236-scsi-fcoe-Make-RT-aware.patch -0237-sunrpc-Make-svc_xprt_do_enqueue-use-get_cpu_light.patch -0238-rt-Introduce-cpu_chill.patch -0239-fs-namespace-Use-cpu_chill-in-trylock-loops.patch -0240-debugobjects-Make-RT-aware.patch -0241-net-Use-skbufhead-with-raw-lock.patch -0242-net-Dequeue-in-dev_cpu_dead-without-the-lock.patch -0243-net-dev-always-take-qdisc-s-busylock-in-__dev_xmit_s.patch -0244-irqwork-push-most-work-into-softirq-context.patch -0245-x86-crypto-Reduce-preempt-disabled-regions.patch -0246-crypto-Reduce-preempt-disabled-regions-more-algos.patch -0247-crypto-limit-more-FPU-enabled-sections.patch -0248-crypto-cryptd-add-a-lock-instead-preempt_disable-loc.patch -0249-panic-skip-get_random_bytes-for-RT_FULL-in-init_oops.patch -0250-x86-stackprotector-Avoid-random-pool-on-rt.patch -0251-random-Make-it-work-on-rt.patch -0252-net-Remove-preemption-disabling-in-netif_rx.patch -0253-lockdep-Make-it-RT-aware.patch -0254-lockdep-selftest-Only-do-hardirq-context-test-for-ra.patch -0255-lockdep-selftest-fix-warnings-due-to-missing-PREEMPT.patch -0256-lockdep-disable-self-test.patch -0257-drm-radeon-i915-Use-preempt_disable-enable_rt-where-.patch -0258-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch -0259-drm-i915-disable-tracing-on-RT.patch -0260-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch 
-0261-drm-i915-gt-Only-disable-interrupts-for-the-timeline.patch -0262-cpuset-Convert-callback_lock-to-raw_spinlock_t.patch -0263-x86-Allow-to-enable-RT.patch -0264-mm-scatterlist-Do-not-disable-irqs-on-RT.patch -0265-sched-Add-support-for-lazy-preemption.patch -0266-x86-entry-Use-should_resched-in-idtentry_exit_cond_r.patch -0267-x86-Support-for-lazy-preemption.patch -0268-arm-Add-support-for-lazy-preemption.patch -0269-powerpc-Add-support-for-lazy-preemption.patch -0270-arch-arm64-Add-lazy-preempt-support.patch -0271-jump-label-disable-if-stop_machine-is-used.patch -0272-leds-trigger-disable-CPU-trigger-on-RT.patch -0273-tty-serial-omap-Make-the-locking-RT-aware.patch -0274-tty-serial-pl011-Make-the-locking-work-on-RT.patch -0275-ARM-enable-irq-in-translation-section-permission-fau.patch -0276-genirq-update-irq_set_irqchip_state-documentation.patch -0277-KVM-arm-arm64-downgrade-preempt_disable-d-region-to-.patch -0278-arm64-fpsimd-Delay-freeing-memory-in-fpsimd_flush_th.patch -0279-x86-Enable-RT-also-on-32bit.patch -0280-ARM-Allow-to-enable-RT.patch -0281-ARM64-Allow-to-enable-RT.patch -0282-powerpc-traps-Use-PREEMPT_RT.patch -0283-powerpc-pseries-iommu-Use-a-locallock-instead-local_.patch -0284-powerpc-kvm-Disable-in-kernel-MPIC-emulation-for-PRE.patch -0285-powerpc-stackprotector-work-around-stack-guard-init-.patch -0286-powerpc-Avoid-recursive-header-includes.patch -0287-POWERPC-Allow-to-enable-RT.patch -0288-drivers-block-zram-Replace-bit-spinlocks-with-rtmute.patch -0289-tpm_tis-fix-stall-after-iowrite-s.patch -0290-signals-Allow-rt-tasks-to-cache-one-sigqueue-struct.patch -0291-signal-Prevent-double-free-of-user-struct.patch -0292-genirq-Disable-irqpoll-on-rt.patch -0293-sysfs-Add-sys-kernel-realtime-entry.patch -0294-Add-localversion-for-RT-release.patch -0295-net-xfrm-Use-sequence-counter-with-associated-spinlo.patch -0296-Linux-5.10.35-rt39-REBASE.patch +# Applied upstream + +########################################################################### +# 
Valentin's PCP fixes +########################################################################### +# Temp RCU patch, Frederick is working on something, too. +rcu-tree-Protect-rcu_rdp_is_offloaded-invocations-on.patch +sched_introduce_migratable.patch +arm64_mm_make_arch_faults_on_old_pte_check_for_migratability.patch + +########################################################################### +# John's printk queue +########################################################################### +printk__rename_printk_cpulock_API_and_always_disable_interrupts.patch +console__add_write_atomic_interface.patch +kdb__only_use_atomic_consoles_for_output_mirroring.patch +serial__8250__implement_write_atomic.patch +printk__relocate_printk_delay.patch +printk__call_boot_delay_msec_in_printk_delay.patch +printk__use_seqcount_latch_for_console_seq.patch +printk__introduce_kernel_sync_mode.patch +printk__move_console_printing_to_kthreads.patch +printk__remove_deferred_printing.patch +printk__add_console_handover.patch +printk__add_pr_flush.patch +printk__Enhance_the_condition_check_of_msleep_in_pr_flush.patch + +########################################################################### +# Posted and applied +########################################################################### +sched-Switch-wait_task_inactive-to-HRTIMER_MODE_REL_.patch +kthread-Move-prio-affinite-change-into-the-newly-cre.patch +genirq-Move-prio-assignment-into-the-newly-created-t.patch +genirq-Disable-irqfixup-poll-on-PREEMPT_RT.patch +efi-Disable-runtime-services-on-RT.patch +efi-Allow-efi-runtime.patch +mm-Disable-zsmalloc-on-PREEMPT_RT.patch +net-core-disable-NET_RX_BUSY_POLL-on-PREEMPT_RT.patch +samples_kfifo__Rename_read_lock_write_lock.patch +crypto_testmgr_only_disable_migration_in_crypto_disable_simd_for_test.patch +mm_allow_only_slub_on_preempt_rt.patch +mm_page_alloc_use_migrate_disable_in_drain_local_pages_wq.patch +mm_scatterlist_replace_the_preemptible_warning_in_sg_miter_stop.patch 
+mm-Disable-NUMA_BALANCING_DEFAULT_ENABLED-and-TRANSP.patch
+x86-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch
+
+# KCOV (akpm)
+0001_documentation_kcov_include_types_h_in_the_example.patch
+0002_documentation_kcov_define_ip_in_the_example.patch
+0003_kcov_allocate_per_cpu_memory_on_the_relevant_node.patch
+0004_kcov_avoid_enable_disable_interrupts_if_in_task.patch
+0005_kcov_replace_local_irq_save_with_a_local_lock_t.patch
+
+# net-next, Qdisc's seqcount removal.
+net-sched-sch_ets-properly-init-all-active-DRR-list-.patch
+0001-gen_stats-Add-instead-Set-the-value-in-__gnet_stats_.patch
+0002-gen_stats-Add-gnet_stats_add_queue.patch
+0003-mq-mqprio-Use-gnet_stats_add_queue.patch
+0004-gen_stats-Move-remaining-users-to-gnet_stats_add_que.patch
+0005-u64_stats-Introduce-u64_stats_set.patch
+0006-net-sched-Protect-Qdisc-bstats-with-u64_stats.patch
+0007-net-sched-Use-_bstats_update-set-instead-of-raw-writ.patch
+0008-net-sched-Merge-Qdisc-bstats-and-Qdisc-cpu_bstats-da.patch
+0009-net-sched-Remove-Qdisc-running-sequence-counter.patch
+net-sched-Allow-statistics-reads-from-softirq.patch
+net-sched-fix-logic-error-in-qdisc_run_begin.patch
+net-sched-remove-one-pair-of-atomic-operations.patch
+net-stats-Read-the-statistics-in-___gnet_stats_copy_.patch
+net-sched-gred-dynamically-allocate-tc_gred_qopt_off.patch
+
+# tip, irqwork
+0001_sched_rt_annotate_the_rt_balancing_logic_irqwork_as_irq_work_hard_irq.patch
+0002_irq_work_allow_irq_work_sync_to_sleep_if_irq_work_no_irq_support.patch
+0003_irq_work_handle_some_irq_work_in_a_per_cpu_thread_on_preempt_rt.patch
+0004_irq_work_also_rcuwait_for_irq_work_hard_irq_on_preempt_rt.patch
+
+###########################################################################
+# Posted
+###########################################################################
+irq_poll-Use-raise_softirq_irqoff-in-cpu_dead-notifi.patch
+smp_wake_ksoftirqd_on_preempt_rt_instead_do_softirq.patch
+fs-namespace-Boost-the-mount_lock.lock-owner-instead.patch
+fscache-Use-only-one-fscache_object_cong_wait.patch
+
+# sched
+0001_sched_clean_up_the_might_sleep_underscore_zoo.patch
+0002_sched_make_cond_resched__lock_variants_consistent_vs_might_sleep.patch
+0003_sched_remove_preempt_offset_argument_from___might_sleep.patch
+0004_sched_cleanup_might_sleep_printks.patch
+0005_sched_make_might_sleep_output_less_confusing.patch
+0006_sched_make_rcu_nest_depth_distinct_in___might_resched.patch
+0007_sched_make_cond_resched_lock_variants_rt_aware.patch
+0008_locking_rt_take_rcu_nesting_into_account_for___might_resched.patch
+#
+0001_sched_limit_the_number_of_task_migrations_per_batch_on_rt.patch
+0002_sched_disable_ttwu_queue_on_rt.patch
+0003_sched_move_kprobes_cleanup_out_of_finish_task_switch.patch
+0004_sched_delay_task_stack_freeing_on_rt.patch
+0005_sched_move_mmdrop_to_rcu_on_rt.patch
+
+###########################################################################
+# Post
+###########################################################################
+cgroup__use_irqsave_in_cgroup_rstat_flush_locked.patch
+mm__workingset__replace_IRQ-off_check_with_a_lockdep_assert..patch
+tcp__Remove_superfluous_BH-disable_around_listening_hash.patch
+
+###########################################################################
+# Kconfig bits:
+###########################################################################
+jump-label__disable_if_stop_machine_is_used.patch
+
+###########################################################################
+# Locking: RT bits. Need review
+###########################################################################
+locking-Remove-rt_rwlock_is_contended.patch
+lockdep-selftests-Avoid-using-local_lock_-acquire-re.patch
+0001-sched-Trigger-warning-if-migration_disabled-counter-.patch
+0003-rtmutex-Add-a-special-case-for-ww-mutex-handling.patch
+0004-rtmutex-Add-rt_mutex_lock_nest_lock-and-rt_mutex_loc.patch
+0005-lockdep-Make-it-RT-aware.patch
+0006-lockdep-selftests-Add-rtmutex-to-the-last-column.patch
+0007-lockdep-selftests-Unbalanced-migrate_disable-rcu_rea.patch
+0008-lockdep-selftests-Skip-the-softirq-related-tests-on-.patch
+0010-lockdep-selftests-Adapt-ww-tests-for-PREEMPT_RT.patch
+locking-Allow-to-include-asm-spinlock_types.h-from-l.patch
+
+###########################################################################
+# preempt: Conditional variants
+###########################################################################
+sched-Make-preempt_enable_no_resched-behave-like-pre.patch
+
+###########################################################################
+# sched:
+###########################################################################
+# cpu-light
+kernel_sched__add_putget_cpu_light.patch
+block_mq__do_not_invoke_preempt_disable.patch
+md__raid5__Make_raid5_percpu_handling_RT_aware.patch
+scsi_fcoe__Make_RT_aware..patch
+mm_vmalloc__Another_preempt_disable_region_which_sucks.patch
+net__Remove_preemption_disabling_in_netif_rx.patch
+sunrpc__Make_svc_xprt_do_enqueue_use_get_cpu_light.patch
+crypto__cryptd_-_add_a_lock_instead_preempt_disable_local_bh_disable.patch
+
+###########################################################################
+# softirq:
+###########################################################################
+softirq__Check_preemption_after_reenabling_interrupts.patch
+
+###########################################################################
+# mm: Assorted RT bits. Need care
+###########################################################################
+u64_stats__Disable_preemption_on_32bit-UP_SMP_with_RT_during_updates.patch
+
+###########################################################################
+# Disable memcontrol for now. The protection scopes are FUBARed
+###########################################################################
+mm-memcontro--Disable-on-PREEMPT_RT.patch
+#mm_memcontrol__Disable_preemption_in___mod_memcg_lruvec_state.patch
+#mm__memcontrol__Replace_disable-IRQ_locking_with_a_local_lock.patch
+#mm_memcontrol__Dont_call_schedule_work_on_in_preemption_disabled_context.patch
+#mm_memcontrol__Replace_local_irq_disable_with_local_locks.patch
+
+###########################################################################
+# ptrace: Revisit
+###########################################################################
+signal__Revert_ptrace_preempt_magic.patch
+ptrace__fix_ptrace_vs_tasklist_lock_race.patch
+
+###########################################################################
+# fs: The namespace part needs a proper fix
+###########################################################################
+fs_dcache__use_swait_queue_instead_of_waitqueue.patch
+fs_dcache__disable_preemption_on_i_dir_seqs_write_side.patch
+
+###########################################################################
+# RCU
+###########################################################################
+rcu__Delay_RCU-selftests.patch
+
+###########################################################################
+# net:
+###########################################################################
+net_core__use_local_bh_disable_in_netif_rx_ni.patch
+net__Use_skbufhead_with_raw_lock.patch
+net__Dequeue_in_dev_cpu_dead_without_the_lock.patch
+net__dev__always_take_qdiscs_busylock_in___dev_xmit_skb.patch
+
+###########################################################################
+# randomness:
+###########################################################################
+panic__skip_get_random_bytes_for_RT_FULL_in_init_oops_id.patch
+x86__stackprotector__Avoid_random_pool_on_rt.patch
+random__Make_it_work_on_rt.patch
+
+###########################################################################
+# DRM:
+###########################################################################
+0002-drm-i915-Don-t-disable-interrupts-and-pretend-a-lock.patch
+0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch
+0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch
+0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch
+0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch
+0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch
+0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch
+0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch
+0010-drm-i915-Drop-the-irqs_disabled-check.patch
+
+###########################################################################
+# X86:
+###########################################################################
+signal_x86__Delay_calling_signals_in_atomic.patch
+x86__kvm_Require_const_tsc_for_RT.patch
+x86__Allow_to_enable_RT.patch
+x86__Enable_RT_also_on_32bit.patch
+
+###########################################################################
+# For later, not essencial
+###########################################################################
+genirq__update_irq_set_irqchip_state_documentation.patch
+ASoC-mediatek-mt8195-Remove-unsued-irqs_lock.patch
+smack-Guard-smack_ipv6_lock-definition-within-a-SMAC.patch
+virt-acrn-Remove-unsued-acrn_irqfds_mutex.patch
+tpm_tis__fix_stall_after_iowrites.patch
+mm-zsmalloc-Replace-bit-spinlock-and-get_cpu_var-usa.patch
+drivers_block_zram__Replace_bit_spinlocks_with_rtmutex_for_-rt.patch
+leds-trigger-Disable-CPU-trigger-on-PREEMPT_RT.patch
+generic-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch
+softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch
+
+###########################################################################
+# Lazy preemption
+###########################################################################
+sched__Add_support_for_lazy_preemption.patch
+x86_entry__Use_should_resched_in_idtentry_exit_cond_resched.patch
+x86__Support_for_lazy_preemption.patch
+entry--Fix-the-preempt-lazy-fallout.patch
+arm__Add_support_for_lazy_preemption.patch
+powerpc__Add_support_for_lazy_preemption.patch
+arch_arm64__Add_lazy_preempt_support.patch
+
+###########################################################################
+# ARM/ARM64
+###########################################################################
+ARM__enable_irq_in_translation_section_permission_fault_handlers.patch
+KVM__arm_arm64__downgrade_preempt_disabled_region_to_migrate_disable.patch
+arm64-sve-Delay-freeing-memory-in-fpsimd_flush_threa.patch
+arm64-sve-Make-kernel-FPU-protection-RT-friendly.patch
+arm64-signal-Use-ARCH_RT_DELAYS_SIGNAL_SEND.patch
+tty_serial_omap__Make_the_locking_RT_aware.patch
+tty_serial_pl011__Make_the_locking_work_on_RT.patch
+ARM__Allow_to_enable_RT.patch
+ARM64__Allow_to_enable_RT.patch
+
+###########################################################################
+# POWERPC
+###########################################################################
+powerpc__traps__Use_PREEMPT_RT.patch
+powerpc_pseries_iommu__Use_a_locallock_instead_local_irq_save.patch
+powerpc_kvm__Disable_in-kernel_MPIC_emulation_for_PREEMPT_RT.patch
+powerpc_stackprotector__work_around_stack-guard_init_from_atomic.patch
+POWERPC__Allow_to_enable_RT.patch
+
+# Sysfs file vs uname() -v
+sysfs__Add__sys_kernel_realtime_entry.patch
+
+###########################################################################
+# RT release version
+###########################################################################
+Add_localversion_for_-RT_release.patch
diff --git a/debian/patches-rt/0181-signal-Revert-ptrace-preempt-magic.patch b/debian/patches-rt/signal__Revert_ptrace_preempt_magic.patch
index e23992d68..9ee13768d 100644
--- a/debian/patches-rt/0181-signal-Revert-ptrace-preempt-magic.patch
+++ b/debian/patches-rt/signal__Revert_ptrace_preempt_magic.patch
@@ -1,23 +1,24 @@
-From 887fccf5470571227571c518215c5fe1e876cfa2 Mon Sep 17 00:00:00 2001
+Subject: signal: Revert ptrace preempt magic
+From: Thomas Gleixner <tglx@linutronix.de>
+Date: Wed Sep 21 19:57:12 2011 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Thomas Gleixner <tglx@linutronix.de>
-Date: Wed, 21 Sep 2011 19:57:12 +0200
-Subject: [PATCH 181/296] signal: Revert ptrace preempt magic
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 Upstream commit '53da1d9456fe7f8 fix ptrace slowness' is nothing more
 than a bandaid around the ptrace design trainwreck. It's not a
 correctness issue, it's merily a cosmetic bandaid.
 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
 ---
- kernel/signal.c | 8 --------
+ kernel/signal.c | 8 --------
 1 file changed, 8 deletions(-)
-
-diff --git a/kernel/signal.c b/kernel/signal.c
-index ef8f2a28d37c..bbd1e9dd7e50 100644
+---
 --- a/kernel/signal.c
 +++ b/kernel/signal.c
-@@ -2203,16 +2203,8 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
+@@ -2275,16 +2275,8 @@ static void ptrace_stop(int exit_code, i
 	if (gstop_done && ptrace_reparented(current))
 		do_notify_parent_cldstop(current, false, why);
 
@@ -34,6 +35,3 @@ index ef8f2a28d37c..bbd1e9dd7e50 100644
 		freezable_schedule();
 		cgroup_leave_frozen(true);
 	} else {
---
-2.30.2
-
diff --git a/debian/patches-rt/0198-signal-x86-Delay-calling-signals-in-atomic.patch b/debian/patches-rt/signal_x86__Delay_calling_signals_in_atomic.patch
index 5567b5a25..987043395 100644
--- a/debian/patches-rt/0198-signal-x86-Delay-calling-signals-in-atomic.patch
+++ b/debian/patches-rt/signal_x86__Delay_calling_signals_in_atomic.patch
@@ -1,8 +1,9 @@
-From 2bb047a8abf19071e7e88f91ae8a290cfa64382b Mon Sep 17 00:00:00 2001
+Subject: signal/x86: Delay calling signals in atomic
+From: Oleg Nesterov <oleg@redhat.com>
+Date: Tue Jul 14 14:26:34 2015 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Oleg Nesterov <oleg@redhat.com>
-Date: Tue, 14 Jul 2015 14:26:34 +0200
-Subject: [PATCH 198/296] signal/x86: Delay calling signals in atomic
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 On x86_64 we must disable preemption before we enable interrupts
 for stack faults, int3 and debugging, because the current task is using
@@ -32,15 +33,16 @@ Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
 [bigeasy: also needed on 32bit as per Yang Shi <yang.shi@linaro.org>]
 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
 ---
- arch/x86/include/asm/signal.h | 13 +++++++++++++
- include/linux/sched.h | 4 ++++
- kernel/entry/common.c | 8 ++++++++
- kernel/signal.c | 28 ++++++++++++++++++++++++++++
+ arch/x86/include/asm/signal.h | 13 +++++++++++++
+ include/linux/sched.h | 4 ++++
+ kernel/entry/common.c | 8 ++++++++
+ kernel/signal.c | 28 ++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)
-
-diff --git a/arch/x86/include/asm/signal.h b/arch/x86/include/asm/signal.h
-index 6fd8410a3910..f3bf2f515edb 100644
+---
 --- a/arch/x86/include/asm/signal.h
 +++ b/arch/x86/include/asm/signal.h
 @@ -28,6 +28,19 @@ typedef struct {
@@ -61,13 +63,11 @@ index 6fd8410a3910..f3bf2f515edb 100644
 +#endif
 +
 #ifndef CONFIG_COMPAT
+ #define compat_sigset_t compat_sigset_t
 typedef sigset_t compat_sigset_t;
-
 #endif
-diff --git a/include/linux/sched.h b/include/linux/sched.h
-index 16f9f402d111..5e21035c7c1e 100644
 --- a/include/linux/sched.h
 +++ b/include/linux/sched.h
-@@ -994,6 +994,10 @@ struct task_struct {
+@@ -1080,6 +1080,10 @@ struct task_struct {
 	/* Restored if set_restore_sigmask() was used: */
 	sigset_t saved_sigmask;
 	struct sigpending pending;
@@ -78,11 +78,9 @@ index 16f9f402d111..5e21035c7c1e 100644
 	unsigned long sas_ss_sp;
 	size_t sas_ss_size;
 	unsigned int sas_ss_flags;
-diff --git a/kernel/entry/common.c b/kernel/entry/common.c
-index 59c666d9d43c..c4e495b04178 100644
 --- a/kernel/entry/common.c
 +++ b/kernel/entry/common.c
-@@ -152,6 +152,14 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
+@@ -162,6 +162,14 @@ static unsigned long exit_to_user_mode_l
 	if (ti_work & _TIF_NEED_RESCHED)
 		schedule();
 
@@ -97,11 +95,9 @@ index 59c666d9d43c..c4e495b04178 100644
 	if (ti_work & _TIF_UPROBE)
 		uprobe_notify_resume(regs);
 
-diff --git a/kernel/signal.c b/kernel/signal.c
-index bbd1e9dd7e50..fde69283d03f 100644
 --- a/kernel/signal.c
 +++ b/kernel/signal.c
-@@ -1314,6 +1314,34 @@ force_sig_info_to_task(struct kernel_siginfo *info, struct task_struct *t)
+@@ -1317,6 +1317,34 @@ force_sig_info_to_task(struct kernel_sig
 	struct k_sigaction *action;
 	int sig = info->si_signo;
 
@@ -136,6 +132,3 @@ index bbd1e9dd7e50..fde69283d03f 100644
 	spin_lock_irqsave(&t->sighand->siglock, flags);
 	action = &t->sighand->action[sig-1];
 	ignored = action->sa.sa_handler == SIG_IGN;
---
-2.30.2
-
diff --git a/debian/patches-rt/smack-Guard-smack_ipv6_lock-definition-within-a-SMAC.patch b/debian/patches-rt/smack-Guard-smack_ipv6_lock-definition-within-a-SMAC.patch
new file mode 100644
index 000000000..1f66e45b7
--- /dev/null
+++ b/debian/patches-rt/smack-Guard-smack_ipv6_lock-definition-within-a-SMAC.patch
@@ -0,0 +1,73 @@
+From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Date: Thu, 9 Sep 2021 12:18:29 +0200
+Subject: [PATCH] smack: Guard smack_ipv6_lock definition within a
+ SMACK_IPV6_PORT_LABELING block
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+The mutex smack_ipv6_lock is only used with the SMACK_IPV6_PORT_LABELING
+block but its definition is outside of the block. This leads to a
+defined-but-not-used warning on PREEMPT_RT.
+
+Moving smack_ipv6_lock down to the block where it is used where it used
+raises the question why is smk_ipv6_port_list read if nothing is added
+to it.
+Turns out, only smk_ipv6_port_check() is using it outside of an ifdef
+SMACK_IPV6_PORT_LABELING block. However two of three caller invoke
+smk_ipv6_port_check() from a ifdef block and only one is using
+__is_defined() macro which requires the function and smk_ipv6_port_list
+to be around.
+
+Put the lock and list inside an ifdef SMACK_IPV6_PORT_LABELING block to
+avoid the warning regarding unused mutex. Extend the ifdef-block to also
+cover smk_ipv6_port_check(). Make smack_socket_connect() use ifdef
+instead of __is_defined() to avoid complains about missing function.
+
+Cc: Casey Schaufler <casey@schaufler-ca.com>
+Cc: James Morris <jmorris@namei.org>
+Cc: "Serge E. Hallyn" <serge@hallyn.com>
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+---
+ security/smack/smack_lsm.c | 9 ++++++---
+ 1 file changed, 6 insertions(+), 3 deletions(-)
+
+--- a/security/smack/smack_lsm.c
++++ b/security/smack/smack_lsm.c
+@@ -51,8 +51,10 @@
+ #define SMK_RECEIVING	1
+ #define SMK_SENDING	2
+
++#ifdef SMACK_IPV6_PORT_LABELING
+ static DEFINE_MUTEX(smack_ipv6_lock);
+ static LIST_HEAD(smk_ipv6_port_list);
++#endif
+ struct kmem_cache *smack_rule_cache;
+ int smack_enabled __initdata;
+
+@@ -2603,7 +2605,6 @@ static void smk_ipv6_port_label(struct s
+ 	mutex_unlock(&smack_ipv6_lock);
+ 	return;
+ }
+-#endif
+
+ /**
+ * smk_ipv6_port_check - check Smack port access
+@@ -2666,6 +2667,7 @@ static int smk_ipv6_port_check(struct so
+
+ 	return smk_ipv6_check(skp, object, address, act);
+ }
++#endif
+
+ /**
+ * smack_inode_setsecurity - set smack xattrs
+@@ -2852,8 +2854,9 @@ static int smack_socket_connect(struct s
+ 		rc = smk_ipv6_check(ssp->smk_out, rsp, sip,
+ 			SMK_CONNECTING);
+ 	}
+-	if (__is_defined(SMACK_IPV6_PORT_LABELING))
+-		rc = smk_ipv6_port_check(sock->sk, sip, SMK_CONNECTING);
++#ifdef SMACK_IPV6_PORT_LABELING
++	rc = smk_ipv6_port_check(sock->sk, sip, SMK_CONNECTING);
++#endif
+
+ 	return rc;
+ }
diff --git a/debian/patches-rt/0133-smp-Wake-ksoftirqd-on-PREEMPT_RT-instead-do_softirq.patch b/debian/patches-rt/smp_wake_ksoftirqd_on_preempt_rt_instead_do_softirq.patch
index 310835cb9..1ffb510a4 100644
--- a/debian/patches-rt/0133-smp-Wake-ksoftirqd-on-PREEMPT_RT-instead-do_softirq.patch
+++ b/debian/patches-rt/smp_wake_ksoftirqd_on_preempt_rt_instead_do_softirq.patch
@@ -1,9 +1,7 @@
-From 3a6045ed8d953328cb6bba1e07178ee99d99e95a Mon Sep 17 00:00:00 2001
 From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-Date: Mon, 15 Feb 2021 18:44:12 +0100
-Subject: [PATCH 133/296] smp: Wake ksoftirqd on PREEMPT_RT instead
- do_softirq().
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
+Subject: smp: Wake ksoftirqd on PREEMPT_RT instead do_softirq().
+Date: Mon, 27 Sep 2021 09:38:14 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
 
 The softirq implementation on PREEMPT_RT does not provide do_softirq().
 The other user of do_softirq() is replaced with a local_bh_disable()
@@ -14,35 +12,35 @@ preemption.
 
 Wake the softirq thread on PREEMPT_RT if there are any pending softirqs.
 
 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Link: https://lore.kernel.org/r/20210927073814.x5h6osr4dgiu44sc@linutronix.de
 ---
+v1…v2: Drop an empty line.
+
 kernel/smp.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
+---
 --- a/kernel/smp.c
 +++ b/kernel/smp.c
-@@ -450,8 +450,18 @@ void flush_smp_call_function_from_idle(void)
+@@ -690,10 +690,20 @@ void flush_smp_call_function_from_idle(v
+ 	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->idle, CFD_SEQ_NOCPU,
+ 		      smp_processor_id(), CFD_SEQ_IDLE);
++
 	local_irq_save(flags);
 	flush_smp_call_function_queue(true);
 -	if (local_softirq_pending())
 -		do_softirq();
 +
 +	if (local_softirq_pending()) {
-+
 +		if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
 +			do_softirq();
 +		} else {
 +			struct task_struct *ksoftirqd = this_cpu_ksoftirqd();
 +
-+		if (ksoftirqd && ksoftirqd->state != TASK_RUNNING)
++		if (ksoftirqd && !task_is_running(ksoftirqd))
 +			wake_up_process(ksoftirqd);
 +		}
 +	}
 	local_irq_restore(flags);
 }
---
-2.30.2
-
diff --git a/debian/patches-rt/softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch b/debian/patches-rt/softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch
new file mode 100644
index 000000000..d68f39a7b
--- /dev/null
+++ b/debian/patches-rt/softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch
@@ -0,0 +1,88 @@
+From: Thomas Gleixner <tglx@linutronix.de>
+Date: Fri, 24 Sep 2021 17:05:48 +0200
+Subject: [PATCH] */softirq: Disable softirq stacks on PREEMPT_RT
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+PREEMPT_RT preempts softirqs and the current implementation avoids
+do_softirq_own_stack() and only uses __do_softirq().
+
+Disable the unused softirqs stacks on PREEMPT_RT to safe some memory and
+ensure that do_softirq_own_stack() is not used which is not expected.
+
+[bigeasy: commit description.]
+
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+---
+ arch/powerpc/kernel/irq.c | 4 ++++
+ arch/sh/kernel/irq.c | 2 ++
+ arch/sparc/kernel/irq_64.c | 2 ++
+ 3 files changed, 8 insertions(+)
+
+--- a/arch/powerpc/kernel/irq.c
++++ b/arch/powerpc/kernel/irq.c
+@@ -690,6 +690,7 @@ static inline void check_stack_overflow(
+ 	}
+ }
+
++#ifndef CONFIG_PREEMPT_RT
+ static __always_inline void call_do_softirq(const void *sp)
+ {
+ 	/* Temporarily switch r1 to sp, call __do_softirq() then restore r1. */
+@@ -708,6 +709,7 @@ static __always_inline void call_do_soft
+ 		"r11", "r12"
+ 	);
+ }
++#endif
+
+ static __always_inline void call_do_irq(struct pt_regs *regs, void *sp)
+ {
+@@ -820,10 +822,12 @@ void *mcheckirq_ctx[NR_CPUS] __read_most
+ void *softirq_ctx[NR_CPUS] __read_mostly;
+ void *hardirq_ctx[NR_CPUS] __read_mostly;
+
++#ifndef CONFIG_PREEMPT_RT
+ void do_softirq_own_stack(void)
+ {
+ 	call_do_softirq(softirq_ctx[smp_processor_id()]);
+ }
++#endif
+
+ irq_hw_number_t virq_to_hw(unsigned int virq)
+ {
+--- a/arch/sh/kernel/irq.c
++++ b/arch/sh/kernel/irq.c
+@@ -149,6 +149,7 @@ void irq_ctx_exit(int cpu)
+ 	hardirq_ctx[cpu] = NULL;
+ }
+
++#ifndef CONFIG_PREEMPT_RT
+ void do_softirq_own_stack(void)
+ {
+ 	struct thread_info *curctx;
+@@ -176,6 +177,7 @@ void do_softirq_own_stack(void)
+ 		"r5", "r6", "r7", "r8", "r9", "r15", "t", "pr"
+ 	);
+ }
++#endif
+ #else
+ static inline void handle_one_irq(unsigned int irq)
+ {
+--- a/arch/sparc/kernel/irq_64.c
++++ b/arch/sparc/kernel/irq_64.c
+@@ -855,6 +855,7 @@ void __irq_entry handler_irq(int pil, st
+ 	set_irq_regs(old_regs);
+ }
+
++#ifndef CONFIG_PREEMPT_RT
+ void do_softirq_own_stack(void)
+ {
+ 	void *orig_sp, *sp = softirq_stack[smp_processor_id()];
+@@ -869,6 +870,7 @@ void do_softirq_own_stack(void)
+ 	__asm__ __volatile__("mov %0, %%sp"
+ 	: : "r" (orig_sp));
+ }
++#endif
+
+ #ifdef CONFIG_HOTPLUG_CPU
+ void fixup_irqs(void)
diff --git a/debian/patches-rt/softirq__Check_preemption_after_reenabling_interrupts.patch b/debian/patches-rt/softirq__Check_preemption_after_reenabling_interrupts.patch
new file mode 100644
index 000000000..5f99e7649
--- /dev/null
+++ b/debian/patches-rt/softirq__Check_preemption_after_reenabling_interrupts.patch
@@ -0,0 +1,103 @@
+Subject: softirq: Check preemption after reenabling interrupts
+From: Thomas Gleixner <tglx@linutronix.de>
+Date: Sun Nov 13 17:17:09 2011 +0100
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
+From: Thomas Gleixner <tglx@linutronix.de>
+
+raise_softirq_irqoff() disables interrupts and wakes the softirq
+daemon, but after reenabling interrupts there is no preemption check,
+so the execution of the softirq thread might be delayed arbitrarily.
+
+In principle we could add that check to local_irq_enable/restore, but
+that's overkill as the rasie_softirq_irqoff() sections are the only
+ones which show this behaviour.
+
+Reported-by: Carsten Emde <cbe@osadl.org>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
+
+---
+ include/linux/preempt.h | 3 +++
+ net/core/dev.c | 7 +++++++
+ 2 files changed, 10 insertions(+)
+---
+--- a/include/linux/preempt.h
++++ b/include/linux/preempt.h
+@@ -191,8 +191,10 @@ do { \
+
+ #ifndef CONFIG_PREEMPT_RT
+ # define preempt_enable_no_resched() sched_preempt_enable_no_resched()
++# define preempt_check_resched_rt() barrier();
+ #else
+ # define preempt_enable_no_resched() preempt_enable()
++# define preempt_check_resched_rt() preempt_check_resched()
+ #endif
+
+ #define preemptible() (preempt_count() == 0 && !irqs_disabled())
+@@ -263,6 +265,7 @@ do { \
+ #define preempt_disable_notrace() barrier()
+ #define preempt_enable_no_resched_notrace() barrier()
+ #define preempt_enable_notrace() barrier()
++#define preempt_check_resched_rt() barrier()
+ #define preemptible() 0
+
+ #endif /* CONFIG_PREEMPT_COUNT */
+--- a/net/core/dev.c
++++ b/net/core/dev.c
+@@ -3040,6 +3040,7 @@ static void __netif_reschedule(struct Qd
+ 	sd->output_queue_tailp = &q->next_sched;
+ 	raise_softirq_irqoff(NET_TX_SOFTIRQ);
+ 	local_irq_restore(flags);
++	preempt_check_resched_rt();
+ }
+
+ void __netif_schedule(struct Qdisc *q)
+@@ -3102,6 +3103,7 @@ void __dev_kfree_skb_irq(struct sk_buff
+ 	__this_cpu_write(softnet_data.completion_queue, skb);
+ 	raise_softirq_irqoff(NET_TX_SOFTIRQ);
+ 	local_irq_restore(flags);
++	preempt_check_resched_rt();
+ }
+ EXPORT_SYMBOL(__dev_kfree_skb_irq);
+
+@@ -4644,6 +4646,7 @@ static int enqueue_to_backlog(struct sk_
+ 	rps_unlock(sd);
+
+ 	local_irq_restore(flags);
++	preempt_check_resched_rt();
+
+ 	atomic_long_inc(&skb->dev->rx_dropped);
+ 	kfree_skb(skb);
+@@ -6387,12 +6390,14 @@ static void net_rps_action_and_irq_enabl
+ 		sd->rps_ipi_list = NULL;
+
+ 		local_irq_enable();
++		preempt_check_resched_rt();
+
+ 		/* Send pending IPI's to kick RPS processing on remote cpus. */
+ 		net_rps_send_ipi(remsd);
+ 	} else
+ #endif
+ 		local_irq_enable();
++	preempt_check_resched_rt();
+ }
+
+ static bool sd_has_rps_ipi_waiting(struct softnet_data *sd)
+@@ -6470,6 +6475,7 @@ void __napi_schedule(struct napi_struct
+ 	local_irq_save(flags);
+ 	____napi_schedule(this_cpu_ptr(&softnet_data), n);
+ 	local_irq_restore(flags);
++	preempt_check_resched_rt();
+ }
+ EXPORT_SYMBOL(__napi_schedule);
+
+@@ -11292,6 +11298,7 @@ static int dev_cpu_dead(unsigned int old
+
+ 	raise_softirq_irqoff(NET_TX_SOFTIRQ);
+ 	local_irq_enable();
++	preempt_check_resched_rt();
+
+ #ifdef CONFIG_RPS
+ 	remsd = oldsd->rps_ipi_list;
diff --git a/debian/patches-rt/0237-sunrpc-Make-svc_xprt_do_enqueue-use-get_cpu_light.patch b/debian/patches-rt/sunrpc__Make_svc_xprt_do_enqueue_use_get_cpu_light.patch
index eb720bf45..b6de72923 100644
--- a/debian/patches-rt/0237-sunrpc-Make-svc_xprt_do_enqueue-use-get_cpu_light.patch
+++ b/debian/patches-rt/sunrpc__Make_svc_xprt_do_enqueue_use_get_cpu_light.patch
@@ -1,9 +1,9 @@
-From d70173a76d0ca9b588a1ba4b594c236d90f81f53 Mon Sep 17 00:00:00 2001
+Subject: sunrpc: Make svc_xprt_do_enqueue() use get_cpu_light()
+From: Mike Galbraith <umgwanakikbuti@gmail.com>
+Date: Wed Feb 18 16:05:28 2015 +0100
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Mike Galbraith <umgwanakikbuti@gmail.com>
-Date: Wed, 18 Feb 2015 16:05:28 +0100
-Subject: [PATCH 237/296] sunrpc: Make svc_xprt_do_enqueue() use
- get_cpu_light()
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 |BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:915
 |in_atomic(): 1, irqs_disabled(): 0, pid: 3194, name: rpc.nfsd
@@ -30,15 +30,16 @@ Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5
 Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
 ---
- net/sunrpc/svc_xprt.c | 4 ++--
+ net/sunrpc/svc_xprt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
-
-diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
-index 06e503466c32..16bd1278a989 100644
+---
 --- a/net/sunrpc/svc_xprt.c
 +++ b/net/sunrpc/svc_xprt.c
-@@ -422,7 +422,7 @@ void svc_xprt_do_enqueue(struct svc_xprt *xprt)
+@@ -441,7 +441,7 @@ void svc_xprt_do_enqueue(struct svc_xprt
 	if (test_and_set_bit(XPT_BUSY, &xprt->xpt_flags))
 		return;
 
@@ -47,7 +48,7 @@ index 06e503466c32..16bd1278a989 100644
 	pool = svc_pool_for_cpu(xprt->xpt_server, cpu);
 
 	atomic_long_inc(&pool->sp_stats.packets);
-@@ -446,7 +446,7 @@ void svc_xprt_do_enqueue(struct svc_xprt *xprt)
+@@ -465,7 +465,7 @@ void svc_xprt_do_enqueue(struct svc_xprt
 	rqstp = NULL;
 out_unlock:
 	rcu_read_unlock();
@@ -56,6 +57,3 @@ index 06e503466c32..16bd1278a989 100644
 	trace_svc_xprt_do_enqueue(xprt, rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_xprt_do_enqueue);
---
-2.30.2
-
diff --git a/debian/patches-rt/0293-sysfs-Add-sys-kernel-realtime-entry.patch b/debian/patches-rt/sysfs__Add__sys_kernel_realtime_entry.patch
index 3bf4454e7..26c81693d 100644
--- a/debian/patches-rt/0293-sysfs-Add-sys-kernel-realtime-entry.patch
+++ b/debian/patches-rt/sysfs__Add__sys_kernel_realtime_entry.patch
@@ -1,8 +1,9 @@
-From c60fec2c6788e8ad69c67a4bf44018028d80b5f6 Mon Sep 17 00:00:00 2001
+Subject: sysfs: Add /sys/kernel/realtime entry
+From: Clark Williams <williams@redhat.com>
+Date: Sat Jul 30 21:55:53 2011 -0500
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Clark Williams <williams@redhat.com>
-Date: Sat, 30 Jul 2011 21:55:53 -0500
-Subject: [PATCH 293/296] sysfs: Add /sys/kernel/realtime entry
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 Add a /sys/kernel entry to indicate that the kernel is a
 realtime kernel.
@@ -15,12 +16,13 @@ Are there better solutions? Should it exist and return 0 on !-rt?
 Signed-off-by: Clark Williams <williams@redhat.com>
 Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
 ---
- kernel/ksysfs.c | 12 ++++++++++++
+ kernel/ksysfs.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
-
-diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
-index 35859da8bd4f..dfff31ed644a 100644
+---
 --- a/kernel/ksysfs.c
 +++ b/kernel/ksysfs.c
 @@ -138,6 +138,15 @@ KERNEL_ATTR_RO(vmcoreinfo);
 /* whether file capabilities are enabled */
 static ssize_t fscaps_show(struct kobject *kobj,
 		struct kobj_attribute *attr, char *buf)
-@@ -228,6 +237,9 @@ static struct attribute * kernel_attrs[] = {
- #ifndef CONFIG_TINY_RCU
+@@ -229,6 +238,9 @@ static struct attribute * kernel_attrs[]
 	&rcu_expedited_attr.attr,
 	&rcu_normal_attr.attr,
+#endif
+#ifdef CONFIG_PREEMPT_RT
+	&realtime_attr.attr,
 #endif
 	NULL
 };
---
-2.30.2
-
+
diff --git a/debian/patches-rt/0115-tcp-Remove-superfluous-BH-disable-around-listening_h.patch b/debian/patches-rt/tcp__Remove_superfluous_BH-disable_around_listening_hash.patch
index 158cdf7aa..013ac2413 100644
--- a/debian/patches-rt/0115-tcp-Remove-superfluous-BH-disable-around-listening_h.patch
+++ b/debian/patches-rt/tcp__Remove_superfluous_BH-disable_around_listening_hash.patch
@@ -1,9 +1,9 @@
-From be7c2411d9aa80d55f1a6a3cc2dd1e49bb28d7ef Mon Sep 17 00:00:00 2001
+Subject: tcp: Remove superfluous BH-disable around listening_hash
+From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Date: Mon Oct 12 17:33:54 2020 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
-Date: Mon, 12 Oct 2020 17:33:54 +0200
-Subject: [PATCH 115/296] tcp: Remove superfluous BH-disable around
- listening_hash
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 Commit 9652dc2eb9e40 ("tcp: relax listening_hash operations")
 
 inet_unhash() conditionally acquires listening_hash->lock.
 
 Reported-by: Mike Galbraith <efault@gmx.de>
 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
 Link: https://lore.kernel.org/linux-rt-users/12d6f9879a97cd56c09fb53dee343cbb14f7f1f7.camel@gmx.de/
 Link: https://lkml.kernel.org/r/X9CheYjuXWc75Spa@hirez.programming.kicks-ass.net
+
+
 ---
- net/ipv4/inet_hashtables.c | 19 ++++++++++++-------
- net/ipv6/inet6_hashtables.c | 5 +----
+ net/ipv4/inet_hashtables.c | 19 ++++++++++++-------
+ net/ipv6/inet6_hashtables.c | 5 +----
 2 files changed, 13 insertions(+), 11 deletions(-)
-
-diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
-index 45fb450b4522..5fb95030e7c0 100644
+---
 --- a/net/ipv4/inet_hashtables.c
 +++ b/net/ipv4/inet_hashtables.c
-@@ -635,7 +635,9 @@ int __inet_hash(struct sock *sk, struct sock *osk)
+@@ -637,7 +637,9 @@ int __inet_hash(struct sock *sk, struct
 	int err = 0;
 
 	if (sk->sk_state != TCP_LISTEN) {
 		return 0;
 	}
 	WARN_ON(!sk_unhashed(sk));
-@@ -667,11 +669,8 @@ int inet_hash(struct sock *sk)
+@@ -669,11 +671,8 @@ int inet_hash(struct sock *sk)
 {
 	int err = 0;
 
 	return err;
 }
 
-@@ -682,17 +681,20 @@ void inet_unhash(struct sock *sk)
+@@ -684,17 +683,20 @@ void inet_unhash(struct sock *sk)
 	struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo;
 	struct inet_listen_hashbucket *ilb = NULL;
 	spinlock_t *lock;
@@ -75,7 +76,7 @@ index 45fb450b4522..5fb95030e7c0 100644
 	if (sk_unhashed(sk))
 		goto unlock;
-@@ -705,7 +707,10 @@ void inet_unhash(struct sock *sk)
+@@ -707,7 +709,10 @@ void inet_unhash(struct sock *sk)
 	__sk_nulls_del_node_init_rcu(sk);
 	sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1);
 unlock:
@@ -87,8 +88,6 @@ index 45fb450b4522..5fb95030e7c0 100644
 }
 EXPORT_SYMBOL_GPL(inet_unhash);
 
-diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
-index 55c290d55605..9bad345cba9a 100644
 --- a/net/ipv6/inet6_hashtables.c
 +++ b/net/ipv6/inet6_hashtables.c
 @@ -333,11 +333,8 @@ int inet6_hash(struct sock *sk)
@@ -104,6 +103,3 @@ index 55c290d55605..9bad345cba9a 100644
 	return err;
 }
---
-2.30.2
-
diff --git a/debian/patches-rt/0289-tpm_tis-fix-stall-after-iowrite-s.patch b/debian/patches-rt/tpm_tis__fix_stall_after_iowrites.patch
index 294620d22..012bed7fd 100644
--- a/debian/patches-rt/0289-tpm_tis-fix-stall-after-iowrite-s.patch
+++ b/debian/patches-rt/tpm_tis__fix_stall_after_iowrites.patch
@@ -1,8 +1,9 @@
-From e44e04fd210ff636370c751e05df198bb5ed253b Mon Sep 17 00:00:00 2001
+Subject: tpm_tis: fix stall after iowrite*()s
+From: Haris Okanovic <haris.okanovic@ni.com>
+Date: Tue Aug 15 15:13:08 2017 -0500
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Haris Okanovic <haris.okanovic@ni.com>
-Date: Tue, 15 Aug 2017 15:13:08 -0500
-Subject: [PATCH 289/296] tpm_tis: fix stall after iowrite*()s
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 ioread8() operations to TPM MMIO addresses can stall the cpu when
 immediately following a sequence of iowrite*()'s to the same region.
 
@@ -21,15 +22,16 @@ amortize the cost of flushing data to chip across multiple instructions.
 Signed-off-by: Haris Okanovic <haris.okanovic@ni.com>
 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
 ---
- drivers/char/tpm/tpm_tis.c | 29 +++++++++++++++++++++++++++--
+ drivers/char/tpm/tpm_tis.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)
-
-diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
-index 4ed6e660273a..c2bd0d40b5fc 100644
+---
 --- a/drivers/char/tpm/tpm_tis.c
 +++ b/drivers/char/tpm/tpm_tis.c
-@@ -50,6 +50,31 @@ static inline struct tpm_tis_tcg_phy *to_tpm_tis_tcg_phy(struct tpm_tis_data *da
+@@ -50,6 +50,31 @@ static inline struct tpm_tis_tcg_phy *to
 	return container_of(data, struct tpm_tis_tcg_phy, priv);
 }
 
 static int interrupts = -1;
 module_param(interrupts, int, 0444);
 MODULE_PARM_DESC(interrupts, "Enable interrupts");
-@@ -169,7 +194,7 @@ static int tpm_tcg_write_bytes(struct tpm_tis_data *data, u32 addr, u16 len,
+@@ -169,7 +194,7 @@ static int tpm_tcg_write_bytes(struct tp
 	struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data);
 
 	while (len--)
@@ -70,7 +72,7 @@ index 4ed6e660273a..c2bd0d40b5fc 100644
 
 	return 0;
 }
-@@ -196,7 +221,7 @@ static int tpm_tcg_write32(struct tpm_tis_data *data, u32 addr, u32 value)
+@@ -196,7 +221,7 @@ static int tpm_tcg_write32(struct tpm_ti
 {
 	struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data);
 
@@ -79,6 +81,3 @@ index 4ed6e660273a..c2bd0d40b5fc 100644
 
 	return 0;
 }
---
-2.30.2
-
diff --git a/debian/patches-rt/0273-tty-serial-omap-Make-the-locking-RT-aware.patch b/debian/patches-rt/tty_serial_omap__Make_the_locking_RT_aware.patch
index dbfdbb121..2b1b57413 100644
--- a/debian/patches-rt/0273-tty-serial-omap-Make-the-locking-RT-aware.patch
+++ b/debian/patches-rt/tty_serial_omap__Make_the_locking_RT_aware.patch
@@ -1,25 +1,26 @@
-From 5adb6ab8f6e119125e87c57652aaddf6dea80b89 Mon Sep 17 00:00:00 2001
+Subject: tty/serial/omap: Make the locking RT aware
+From: Thomas Gleixner <tglx@linutronix.de>
+Date: Thu Jul 28 13:32:57 2011 +0200
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Thomas Gleixner <tglx@linutronix.de>
-Date: Thu, 28 Jul 2011 13:32:57 +0200
-Subject: [PATCH 273/296] tty/serial/omap: Make the locking RT aware
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 The lock is a sleeping lock and local_irq_save() is not the optimsation
 we are looking for. Redo it to make it work on -RT and non-RT.
 
 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
 ---
- drivers/tty/serial/omap-serial.c | 12 ++++--------
+ drivers/tty/serial/omap-serial.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)
-
-diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
-index 76b94d0ff586..80371598efea 100644
+---
 --- a/drivers/tty/serial/omap-serial.c
 +++ b/drivers/tty/serial/omap-serial.c
-@@ -1301,13 +1301,10 @@ serial_omap_console_write(struct console *co, const char *s,
-
-	pm_runtime_get_sync(up->dev);
+@@ -1255,13 +1255,10 @@ serial_omap_console_write(struct console
+	unsigned int ier;
+	int locked = 1;
 
-	local_irq_save(flags);
-	if (up->port.sysrq)
@@ -34,9 +35,9 @@ index 76b94d0ff586..80371598efea 100644
 	/*
 	 * First save the IER then disable the interrupts
-@@ -1336,8 +1333,7 @@ serial_omap_console_write(struct console *co, const char *s,
-	pm_runtime_mark_last_busy(up->dev);
-	pm_runtime_put_autosuspend(up->dev);
+@@ -1288,8 +1285,7 @@ serial_omap_console_write(struct console
+	check_modem_status(up);
+
 	if (locked)
-		spin_unlock(&up->port.lock);
-	local_irq_restore(flags);
@@ -44,6 +45,3 @@ index 76b94d0ff586..80371598efea 100644
 }
 
 static int __init
---
-2.30.2
-
diff --git a/debian/patches-rt/0274-tty-serial-pl011-Make-the-locking-work-on-RT.patch b/debian/patches-rt/tty_serial_pl011__Make_the_locking_work_on_RT.patch
index 63d99ad81..d1445223d 100644
--- a/debian/patches-rt/0274-tty-serial-pl011-Make-the-locking-work-on-RT.patch
+++ b/debian/patches-rt/tty_serial_pl011__Make_the_locking_work_on_RT.patch
@@ -1,22 +1,23 @@
-From bf620e0efede449246d31ebb68102f107bf587e1 Mon Sep 17 00:00:00 2001
+Subject: tty/serial/pl011: Make the locking work on RT
+From: Thomas Gleixner <tglx@linutronix.de>
+Date: Tue Jan 8 21:36:51 2013 +0100
+Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz
+
 From: Thomas Gleixner <tglx@linutronix.de>
-Date: Tue, 8 Jan 2013 21:36:51 +0100
-Subject: [PATCH 274/296] tty/serial/pl011: Make the locking work on RT
-Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz
 
 The lock is a sleeping lock and local_irq_save() is not the optimsation
 we are looking for. Redo it to make it work on -RT and non-RT.
 
 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
+
+
 ---
- drivers/tty/serial/amba-pl011.c | 17 +++++++++++------
+ drivers/tty/serial/amba-pl011.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)
-
-diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
-index 87dc3fc15694..c529a804b6fc 100644
+---
 --- a/drivers/tty/serial/amba-pl011.c
 +++ b/drivers/tty/serial/amba-pl011.c
-@@ -2201,18 +2201,24 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
+@@ -2336,18 +2336,24 @@ pl011_console_write(struct console *co,
 {
 	struct uart_amba_port *uap = amba_ports[co->index];
 	unsigned int old_cr = 0, new_cr;
@@ -45,7 +46,7 @@ index 87dc3fc15694..c529a804b6fc 100644
 	/*
 	 * First save the CR then disable the interrupts
-@@ -2238,8 +2244,7 @@ pl011_console_write(struct console *co, const char *s, unsigned int count)
+@@ -2373,8 +2379,7 @@ pl011_console_write(struct console *co,
 	pl011_write(old_cr, uap, REG_CR);
 
 	if (locked)
@@ -55,6 +56,3 @@ index 87dc3fc15694..c529a804b6fc 100644
 	clk_disable(uap->clk);
 }
 
---
-2.30.2
-
diff --git 
a/debian/patches-rt/0186-u64_stats-Disable-preemption-on-32bit-UP-SMP-with-RT.patch b/debian/patches-rt/u64_stats__Disable_preemption_on_32bit-UP_SMP_with_RT_during_updates.patch index b84e4810c..d962a4519 100644 --- a/debian/patches-rt/0186-u64_stats-Disable-preemption-on-32bit-UP-SMP-with-RT.patch +++ b/debian/patches-rt/u64_stats__Disable_preemption_on_32bit-UP_SMP_with_RT_during_updates.patch @@ -1,9 +1,9 @@ -From efde4e7128917ee2dada55ff0a9733e81f4dbc7e Mon Sep 17 00:00:00 2001 +Subject: u64_stats: Disable preemption on 32bit-UP/SMP with RT during updates +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Mon Aug 17 12:28:10 2020 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Mon, 17 Aug 2020 12:28:10 +0200 -Subject: [PATCH 186/296] u64_stats: Disable preemption on 32bit-UP/SMP with RT - during updates -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz On RT the seqcount_t is required even on UP because the softirq can be preempted. The IRQ handler is threaded so it is also preemptible. @@ -14,12 +14,13 @@ disabling preemption is enough to guarantee that the update is not interruped. 
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - include/linux/u64_stats_sync.h | 42 ++++++++++++++++++++++------------ + include/linux/u64_stats_sync.h | 42 +++++++++++++++++++++++++++-------------- 1 file changed, 28 insertions(+), 14 deletions(-) - -diff --git a/include/linux/u64_stats_sync.h b/include/linux/u64_stats_sync.h -index e81856c0ba13..66eb968a09d4 100644 +--- --- a/include/linux/u64_stats_sync.h +++ b/include/linux/u64_stats_sync.h @@ -66,7 +66,7 @@ @@ -31,7 +32,7 @@ index e81856c0ba13..66eb968a09d4 100644 seqcount_t seq; #endif }; -@@ -115,7 +115,7 @@ static inline void u64_stats_inc(u64_stats_t *p) +@@ -125,7 +125,7 @@ static inline void u64_stats_inc(u64_sta } #endif @@ -40,7 +41,7 @@ index e81856c0ba13..66eb968a09d4 100644 #define u64_stats_init(syncp) seqcount_init(&(syncp)->seq) #else static inline void u64_stats_init(struct u64_stats_sync *syncp) -@@ -125,15 +125,19 @@ static inline void u64_stats_init(struct u64_stats_sync *syncp) +@@ -135,15 +135,19 @@ static inline void u64_stats_init(struct static inline void u64_stats_update_begin(struct u64_stats_sync *syncp) { @@ -62,7 +63,7 @@ index e81856c0ba13..66eb968a09d4 100644 #endif } -@@ -142,8 +146,11 @@ u64_stats_update_begin_irqsave(struct u64_stats_sync *syncp) +@@ -152,8 +156,11 @@ u64_stats_update_begin_irqsave(struct u6 { unsigned long flags = 0; @@ -76,7 +77,7 @@ index e81856c0ba13..66eb968a09d4 100644 write_seqcount_begin(&syncp->seq); #endif return flags; -@@ -153,15 +160,18 @@ static inline void +@@ -163,15 +170,18 @@ static inline void u64_stats_update_end_irqrestore(struct u64_stats_sync *syncp, unsigned long flags) { @@ -98,7 +99,7 @@ index e81856c0ba13..66eb968a09d4 100644 return read_seqcount_begin(&syncp->seq); #else return 0; -@@ -170,7 +180,7 @@ static inline unsigned int __u64_stats_fetch_begin(const struct u64_stats_sync * +@@ -180,7 +190,7 @@ static inline unsigned int __u64_stats_f static inline 
unsigned int u64_stats_fetch_begin(const struct u64_stats_sync *syncp) { @@ -107,7 +108,7 @@ index e81856c0ba13..66eb968a09d4 100644 preempt_disable(); #endif return __u64_stats_fetch_begin(syncp); -@@ -179,7 +189,7 @@ static inline unsigned int u64_stats_fetch_begin(const struct u64_stats_sync *sy +@@ -189,7 +199,7 @@ static inline unsigned int u64_stats_fet static inline bool __u64_stats_fetch_retry(const struct u64_stats_sync *syncp, unsigned int start) { @@ -116,7 +117,7 @@ index e81856c0ba13..66eb968a09d4 100644 return read_seqcount_retry(&syncp->seq, start); #else return false; -@@ -189,7 +199,7 @@ static inline bool __u64_stats_fetch_retry(const struct u64_stats_sync *syncp, +@@ -199,7 +209,7 @@ static inline bool __u64_stats_fetch_ret static inline bool u64_stats_fetch_retry(const struct u64_stats_sync *syncp, unsigned int start) { @@ -125,7 +126,7 @@ index e81856c0ba13..66eb968a09d4 100644 preempt_enable(); #endif return __u64_stats_fetch_retry(syncp, start); -@@ -203,7 +213,9 @@ static inline bool u64_stats_fetch_retry(const struct u64_stats_sync *syncp, +@@ -213,7 +223,9 @@ static inline bool u64_stats_fetch_retry */ static inline unsigned int u64_stats_fetch_begin_irq(const struct u64_stats_sync *syncp) { @@ -136,7 +137,7 @@ index e81856c0ba13..66eb968a09d4 100644 local_irq_disable(); #endif return __u64_stats_fetch_begin(syncp); -@@ -212,7 +224,9 @@ static inline unsigned int u64_stats_fetch_begin_irq(const struct u64_stats_sync +@@ -222,7 +234,9 @@ static inline unsigned int u64_stats_fet static inline bool u64_stats_fetch_retry_irq(const struct u64_stats_sync *syncp, unsigned int start) { @@ -147,6 +148,3 @@ index e81856c0ba13..66eb968a09d4 100644 local_irq_enable(); #endif return __u64_stats_fetch_retry(syncp, start); --- -2.30.2 - diff --git a/debian/patches-rt/virt-acrn-Remove-unsued-acrn_irqfds_mutex.patch b/debian/patches-rt/virt-acrn-Remove-unsued-acrn_irqfds_mutex.patch new file mode 100644 index 000000000..97b10c599 --- /dev/null +++ 
b/debian/patches-rt/virt-acrn-Remove-unsued-acrn_irqfds_mutex.patch @@ -0,0 +1,26 @@ +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu, 9 Sep 2021 10:15:30 +0200 +Subject: [PATCH] virt: acrn: Remove unsued acrn_irqfds_mutex. +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +acrn_irqfds_mutex is not used, never was. + +Remove acrn_irqfds_mutex. + +Fixes: aa3b483ff1d71 ("virt: acrn: Introduce irqfd") +Cc: Fei Li <fei1.li@intel.com> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +--- + drivers/virt/acrn/irqfd.c | 1 - + 1 file changed, 1 deletion(-) + +--- a/drivers/virt/acrn/irqfd.c ++++ b/drivers/virt/acrn/irqfd.c +@@ -17,7 +17,6 @@ + #include "acrn_drv.h" + + static LIST_HEAD(acrn_irqfd_clients); +-static DEFINE_MUTEX(acrn_irqfds_mutex); + + /** + * struct hsm_irqfd - Properties of HSM irqfd diff --git a/debian/patches-rt/x86-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch b/debian/patches-rt/x86-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch new file mode 100644 index 000000000..da12526e4 --- /dev/null +++ b/debian/patches-rt/x86-softirq-Disable-softirq-stacks-on-PREEMPT_RT.patch @@ -0,0 +1,58 @@ +From: Thomas Gleixner <tglx@linutronix.de> +Subject: x86/softirq: Disable softirq stacks on PREEMPT_RT +Date: Fri, 24 Sep 2021 18:12:45 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + +PREEMPT_RT preempts softirqs and the current implementation avoids +do_softirq_own_stack() and only uses __do_softirq(). + +Disable the unused softirqs stacks on PREEMPT_RT to safe some memory and +ensure that do_softirq_own_stack() is not used which is not expected. + +[bigeasy: commit description.] 
+ +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> +Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Link: https://lore.kernel.org/r/20210924161245.2357247-1-bigeasy@linutronix.de +--- + arch/x86/include/asm/irq_stack.h | 3 +++ + arch/x86/kernel/irq_32.c | 2 ++ + 2 files changed, 5 insertions(+) + +--- a/arch/x86/include/asm/irq_stack.h ++++ b/arch/x86/include/asm/irq_stack.h +@@ -201,6 +201,7 @@ + IRQ_CONSTRAINTS, regs, vector); \ + } + ++#ifndef CONFIG_PREEMPT_RT + /* + * Macro to invoke __do_softirq on the irq stack. This is only called from + * task context when bottom halves are about to be reenabled and soft +@@ -214,6 +215,8 @@ + __this_cpu_write(hardirq_stack_inuse, false); \ + } + ++#endif ++ + #else /* CONFIG_X86_64 */ + /* System vector handlers always run on the stack they interrupted. */ + #define run_sysvec_on_irqstack_cond(func, regs) \ +--- a/arch/x86/kernel/irq_32.c ++++ b/arch/x86/kernel/irq_32.c +@@ -132,6 +132,7 @@ int irq_init_percpu_irqstack(unsigned in + return 0; + } + ++#ifndef CONFIG_PREEMPT_RT + void do_softirq_own_stack(void) + { + struct irq_stack *irqstk; +@@ -148,6 +149,7 @@ void do_softirq_own_stack(void) + + call_on_stack(__do_softirq, isp); + } ++#endif + + void __handle_irq(struct irq_desc *desc, struct pt_regs *regs) + { diff --git a/debian/patches-rt/0263-x86-Allow-to-enable-RT.patch b/debian/patches-rt/x86__Allow_to_enable_RT.patch index 9732853ca..ffa6d3946 100644 --- a/debian/patches-rt/0263-x86-Allow-to-enable-RT.patch +++ b/debian/patches-rt/x86__Allow_to_enable_RT.patch @@ -1,18 +1,20 @@ -From ded1f7f8d2dd0142319e9ed4620867a26f192d8b Mon Sep 17 00:00:00 2001 +Subject: x86: Allow to enable RT +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Wed Aug 7 18:15:38 2019 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Wed, 7 Aug 2019 18:15:38 +0200 -Subject: [PATCH 263/296] 
x86: Allow to enable RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Allow to select RT. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/x86/Kconfig | 1 + + arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) - -diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig -index 7e1fd20234db..7b001f7ec3e8 100644 +--- --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -27,6 +27,7 @@ config X86_64 @@ -23,6 +25,3 @@ index 7e1fd20234db..7b001f7ec3e8 100644 select ARCH_USE_CMPXCHG_LOCKREF select HAVE_ARCH_SOFT_DIRTY select MODULES_USE_ELF_RELA --- -2.30.2 - diff --git a/debian/patches-rt/0279-x86-Enable-RT-also-on-32bit.patch b/debian/patches-rt/x86__Enable_RT_also_on_32bit.patch index ee5569fc0..c9058c476 100644 --- a/debian/patches-rt/0279-x86-Enable-RT-also-on-32bit.patch +++ b/debian/patches-rt/x86__Enable_RT_also_on_32bit.patch @@ -1,16 +1,18 @@ -From cec4f64454684e533c3bc23b5792c2f2e37a0612 Mon Sep 17 00:00:00 2001 +Subject: x86: Enable RT also on 32bit +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Thu Nov 7 17:49:20 2019 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Thu, 7 Nov 2019 17:49:20 +0100 -Subject: [PATCH 279/296] x86: Enable RT also on 32bit -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/x86/Kconfig | 2 +- + arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig -index 252b8bdf1c71..a2c12ad69173 100644 +--- --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -27,7 +27,6 @@ config X86_64 @@ -21,14 +23,11 @@ index 252b8bdf1c71..a2c12ad69173 
100644 select ARCH_USE_CMPXCHG_LOCKREF select HAVE_ARCH_SOFT_DIRTY select MODULES_USE_ELF_RELA -@@ -95,6 +94,7 @@ config X86 - select ARCH_SUPPORTS_ACPI - select ARCH_SUPPORTS_ATOMIC_RMW - select ARCH_SUPPORTS_NUMA_BALANCING if X86_64 +@@ -108,6 +107,7 @@ config X86 + select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096 + select ARCH_SUPPORTS_LTO_CLANG + select ARCH_SUPPORTS_LTO_CLANG_THIN + select ARCH_SUPPORTS_RT select ARCH_USE_BUILTIN_BSWAP + select ARCH_USE_MEMTEST select ARCH_USE_QUEUED_RWLOCKS - select ARCH_USE_QUEUED_SPINLOCKS --- -2.30.2 - diff --git a/debian/patches-rt/0267-x86-Support-for-lazy-preemption.patch b/debian/patches-rt/x86__Support_for_lazy_preemption.patch index 37a92ae8d..825844f50 100644 --- a/debian/patches-rt/0267-x86-Support-for-lazy-preemption.patch +++ b/debian/patches-rt/x86__Support_for_lazy_preemption.patch @@ -1,25 +1,26 @@ -From 4cef1917310065d68231c36c8ae7af585c99d1da Mon Sep 17 00:00:00 2001 +Subject: x86: Support for lazy preemption +From: Thomas Gleixner <tglx@linutronix.de> +Date: Thu Nov 1 11:03:47 2012 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 1 Nov 2012 11:03:47 +0100 -Subject: [PATCH 267/296] x86: Support for lazy preemption -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Implement the x86 pieces for lazy preempt. 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> ---- - arch/x86/Kconfig | 1 + - arch/x86/include/asm/preempt.h | 33 +++++++++++++++++++++++++++++- - arch/x86/include/asm/thread_info.h | 11 ++++++++++ - include/linux/entry-common.h | 2 +- - kernel/entry/common.c | 2 +- - 5 files changed, 46 insertions(+), 3 deletions(-) -diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig -index 7b001f7ec3e8..252b8bdf1c71 100644 + +--- + arch/x86/Kconfig | 1 + + arch/x86/include/asm/preempt.h | 33 ++++++++++++++++++++++++++++++++- + arch/x86/include/asm/thread_info.h | 7 +++++++ + include/linux/entry-common.h | 2 +- + kernel/entry/common.c | 2 +- + 5 files changed, 42 insertions(+), 3 deletions(-) +--- --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig -@@ -212,6 +212,7 @@ config X86 +@@ -231,6 +231,7 @@ config X86 select HAVE_PCI select HAVE_PERF_REGS select HAVE_PERF_USER_STACK_DUMP @@ -27,11 +28,9 @@ index 7b001f7ec3e8..252b8bdf1c71 100644 select MMU_GATHER_RCU_TABLE_FREE if PARAVIRT select HAVE_POSIX_CPU_TIMERS_TASK_WORK select HAVE_REGS_AND_STACK_ACCESS_API -diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h -index 5ef93c81b274..471dec2d78e1 100644 --- a/arch/x86/include/asm/preempt.h +++ b/arch/x86/include/asm/preempt.h -@@ -89,17 +89,48 @@ static __always_inline void __preempt_count_sub(int val) +@@ -90,17 +90,48 @@ static __always_inline void __preempt_co * a decrement which hits zero means we have no preempt_count and should * reschedule. 
*/ @@ -81,52 +80,40 @@ index 5ef93c81b274..471dec2d78e1 100644 } #ifdef CONFIG_PREEMPTION -diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h -index e701f29b4881..596a46c4a05d 100644 --- a/arch/x86/include/asm/thread_info.h +++ b/arch/x86/include/asm/thread_info.h -@@ -56,17 +56,24 @@ struct task_struct; - struct thread_info { +@@ -57,11 +57,14 @@ struct thread_info { unsigned long flags; /* low level flags */ + unsigned long syscall_work; /* SYSCALL_WORK_ flags */ u32 status; /* thread synchronous flags */ + int preempt_lazy_count; /* 0 => lazy preemptable -+ <0 => BUG */ ++ <0 => BUG */ }; #define INIT_THREAD_INFO(tsk) \ { \ .flags = 0, \ -+ .preempt_lazy_count = 0, \ ++ .preempt_lazy_count = 0, \ } #else /* !__ASSEMBLY__ */ - - #include <asm/asm-offsets.h> - -+#define GET_THREAD_INFO(reg) \ -+ _ASM_MOV PER_CPU_VAR(cpu_current_top_of_stack),reg ; \ -+ _ASM_SUB $(THREAD_SIZE),reg ; -+ - #endif - - /* -@@ -93,6 +100,7 @@ struct thread_info { +@@ -90,6 +93,7 @@ struct thread_info { #define TIF_NOTSC 16 /* TSC is not accessible in userland */ - #define TIF_IA32 17 /* IA32 compatibility process */ + #define TIF_NOTIFY_SIGNAL 17 /* signal notifications exist */ #define TIF_SLD 18 /* Restore split lock detection on context switch */ +#define TIF_NEED_RESCHED_LAZY 19 /* lazy rescheduling necessary */ #define TIF_MEMDIE 20 /* is terminating due to OOM killer */ #define TIF_POLLING_NRFLAG 21 /* idle is polling for TIF_NEED_RESCHED */ #define TIF_IO_BITMAP 22 /* uses I/O bitmap */ -@@ -122,6 +130,7 @@ struct thread_info { +@@ -114,6 +118,7 @@ struct thread_info { #define _TIF_NOTSC (1 << TIF_NOTSC) - #define _TIF_IA32 (1 << TIF_IA32) + #define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL) #define _TIF_SLD (1 << TIF_SLD) +#define _TIF_NEED_RESCHED_LAZY (1 << TIF_NEED_RESCHED_LAZY) #define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG) #define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP) - #define _TIF_FORCED_TF (1 << TIF_FORCED_TF) -@@ -154,6 +163,8 @@ 
struct thread_info { + #define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE) +@@ -145,6 +150,8 @@ struct thread_info { #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW) @@ -135,24 +122,20 @@ index e701f29b4881..596a46c4a05d 100644 #define STACK_WARN (THREAD_SIZE/8) /* -diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h -index 7dff07713a07..78765caeabc0 100644 --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h -@@ -69,7 +69,7 @@ +@@ -59,7 +59,7 @@ #define EXIT_TO_USER_MODE_WORK \ (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_UPROBE | \ -- _TIF_NEED_RESCHED | _TIF_PATCH_PENDING | \ -+ _TIF_NEED_RESCHED_MASK | _TIF_PATCH_PENDING | \ +- _TIF_NEED_RESCHED | _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \ ++ _TIF_NEED_RESCHED_MASK | _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL | \ ARCH_EXIT_TO_USER_MODE_WORK) /** -diff --git a/kernel/entry/common.c b/kernel/entry/common.c -index e579b2ff4f94..e73fcc57e367 100644 --- a/kernel/entry/common.c +++ b/kernel/entry/common.c -@@ -149,7 +149,7 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs, +@@ -159,7 +159,7 @@ static unsigned long exit_to_user_mode_l local_irq_enable_exit_to_user(ti_work); @@ -161,6 +144,3 @@ index e579b2ff4f94..e73fcc57e367 100644 schedule(); #ifdef ARCH_RT_DELAYS_SIGNAL_SEND --- -2.30.2 - diff --git a/debian/patches-rt/0215-x86-kvm-Require-const-tsc-for-RT.patch b/debian/patches-rt/x86__kvm_Require_const_tsc_for_RT.patch index 61c077ee0..bf084fcfe 100644 --- a/debian/patches-rt/0215-x86-kvm-Require-const-tsc-for-RT.patch +++ b/debian/patches-rt/x86__kvm_Require_const_tsc_for_RT.patch @@ -1,8 +1,9 @@ -From 5ca00fdacf614a7da5ddf44f7c1589e523e35d16 Mon Sep 17 00:00:00 2001 +Subject: x86: kvm Require const tsc for RT +From: Thomas Gleixner <tglx@linutronix.de> +Date: Sun Nov 6 12:26:18 2011 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Sun, 6 Nov 2011 
12:26:18 +0100 -Subject: [PATCH 215/296] x86: kvm Require const tsc for RT -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz Non constant TSC is a nightmare on bare metal already, but with virtualization it becomes a complete disaster because the workarounds @@ -10,15 +11,15 @@ are horrible latency wise. That's also a preliminary for running RT in a guest on top of a RT host. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - arch/x86/kvm/x86.c | 8 ++++++++ + arch/x86/kvm/x86.c | 8 ++++++++ 1 file changed, 8 insertions(+) - -diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c -index 0d8383b82bca..b53e8e693ee5 100644 +--- --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c -@@ -7908,6 +7908,14 @@ int kvm_arch_init(void *opaque) +@@ -8433,6 +8433,14 @@ int kvm_arch_init(void *opaque) goto out; } @@ -33,6 +34,3 @@ index 0d8383b82bca..b53e8e693ee5 100644 r = -ENOMEM; x86_fpu_cache = kmem_cache_create("x86_fpu", sizeof(struct fpu), __alignof__(struct fpu), SLAB_ACCOUNT, --- -2.30.2 - diff --git a/debian/patches-rt/0250-x86-stackprotector-Avoid-random-pool-on-rt.patch b/debian/patches-rt/x86__stackprotector__Avoid_random_pool_on_rt.patch index 2a8ca050e..a97efc1e6 100644 --- a/debian/patches-rt/0250-x86-stackprotector-Avoid-random-pool-on-rt.patch +++ b/debian/patches-rt/x86__stackprotector__Avoid_random_pool_on_rt.patch @@ -1,8 +1,9 @@ -From 80bbd1d1f4be48b62ddcb481b241a09047ba0241 Mon Sep 17 00:00:00 2001 +Subject: x86: stackprotector: Avoid random pool on rt +From: Thomas Gleixner <tglx@linutronix.de> +Date: Thu Dec 16 14:25:18 2010 +0100 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Thomas Gleixner <tglx@linutronix.de> -Date: Thu, 16 Dec 2010 14:25:18 +0100 -Subject: [PATCH 250/296] x86: stackprotector: Avoid random pool on rt -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz CPU bringup calls into 
the random pool to initialize the stack canary. During boot that works nicely even on RT as the might sleep @@ -14,15 +15,16 @@ entropy and we rely on the TSC randomnness. Reported-by: Carsten Emde <carsten.emde@osadl.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + + --- - arch/x86/include/asm/stackprotector.h | 8 +++++++- + arch/x86/include/asm/stackprotector.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) - -diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h -index 7fb482f0f25b..3df0a95c9e13 100644 +--- --- a/arch/x86/include/asm/stackprotector.h +++ b/arch/x86/include/asm/stackprotector.h -@@ -65,7 +65,7 @@ +@@ -50,7 +50,7 @@ */ static __always_inline void boot_init_stack_canary(void) { @@ -31,7 +33,7 @@ index 7fb482f0f25b..3df0a95c9e13 100644 u64 tsc; #ifdef CONFIG_X86_64 -@@ -76,8 +76,14 @@ static __always_inline void boot_init_stack_canary(void) +@@ -61,8 +61,14 @@ static __always_inline void boot_init_st * of randomness. The TSC only matters for very early init, * there it already has some randomness on most systems. Later * on during the bootup the random pool has true entropy too. 
@@ -46,6 +48,3 @@ index 7fb482f0f25b..3df0a95c9e13 100644 tsc = rdtsc(); canary += tsc + (tsc << 32UL); canary &= CANARY_MASK; --- -2.30.2 - diff --git a/debian/patches-rt/0266-x86-entry-Use-should_resched-in-idtentry_exit_cond_r.patch b/debian/patches-rt/x86_entry__Use_should_resched_in_idtentry_exit_cond_resched.patch index 8f57471f0..3f8b7bd6f 100644 --- a/debian/patches-rt/0266-x86-entry-Use-should_resched-in-idtentry_exit_cond_r.patch +++ b/debian/patches-rt/x86_entry__Use_should_resched_in_idtentry_exit_cond_resched.patch @@ -1,9 +1,9 @@ -From d2766c62e7566a95468cc743802fec7b301ad4d7 Mon Sep 17 00:00:00 2001 +Subject: x86/entry: Use should_resched() in idtentry_exit_cond_resched() +From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Date: Tue Jun 30 11:45:14 2020 +0200 +Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.15/older/patches-5.15.3-rt21.tar.xz + From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> -Date: Tue, 30 Jun 2020 11:45:14 +0200 -Subject: [PATCH 266/296] x86/entry: Use should_resched() in - idtentry_exit_cond_resched() -Origin: https://www.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.35-rt39.tar.xz The TIF_NEED_RESCHED bit is inlined on x86 into the preemption counter. By using should_resched(0) instead of need_resched() the same check can @@ -13,15 +13,16 @@ issued before. Use should_resched(0) instead need_resched(). 
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> +Signed-off-by: Thomas Gleixner <tglx@linutronix.de> + + --- - kernel/entry/common.c | 2 +- + kernel/entry/common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/kernel/entry/common.c b/kernel/entry/common.c -index c4e495b04178..e579b2ff4f94 100644 +--- --- a/kernel/entry/common.c +++ b/kernel/entry/common.c -@@ -363,7 +363,7 @@ void irqentry_exit_cond_resched(void) +@@ -395,7 +395,7 @@ void irqentry_exit_cond_resched(void) rcu_irq_exit_check_preempt(); if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) WARN_ON_ONCE(!on_thread_stack()); @@ -30,6 +31,3 @@ index c4e495b04178..e579b2ff4f94 100644 preempt_schedule_irq(); } } --- -2.30.2 - diff --git a/debian/patches/bugfix/all/HID-apple-Add-missing-scan-code-event-for-keys-handl.patch b/debian/patches/bugfix/all/HID-apple-Add-missing-scan-code-event-for-keys-handl.patch deleted file mode 100644 index 4cccc3896..000000000 --- a/debian/patches/bugfix/all/HID-apple-Add-missing-scan-code-event-for-keys-handl.patch +++ /dev/null @@ -1,116 +0,0 @@ -From: Vincent Lefevre <vincent@vinc17.net> -Date: Thu, 22 Jul 2021 03:25:44 +0200 -Subject: HID: apple: Add missing scan code event for keys handled by hid-apple -Origin: https://git.kernel.org/linus/3b41fb4094914903fd8e50a13def9e47763dc101 -Bug-Debian: https://bugs.debian.org/757356 - -When an EV_KEY event is generated by hid-apple due to special key -mapping, the usual associated scan code event (EV_MSC) is missing. -This issue can be seen with the evtest utility. - -Add the scan code event for these special keys. 
- -BugLink: https://bugs.debian.org/757356 -Co-developed-by: Daniel Lin <ephemient@gmail.com> -Signed-off-by: Daniel Lin <ephemient@gmail.com> -Signed-off-by: Vincent Lefevre <vincent@vinc17.net> -Signed-off-by: Jiri Kosina <jkosina@suse.cz> ---- - drivers/hid/hid-apple.c | 32 +++++++++++++++++++++++--------- - 1 file changed, 23 insertions(+), 9 deletions(-) - -diff --git a/drivers/hid/hid-apple.c b/drivers/hid/hid-apple.c -index 6b8f0d004d34..cde92de7fca7 100644 ---- a/drivers/hid/hid-apple.c -+++ b/drivers/hid/hid-apple.c -@@ -187,6 +187,15 @@ static const struct apple_key_translation *apple_find_translation( - return NULL; - } - -+static void input_event_with_scancode(struct input_dev *input, -+ __u8 type, __u16 code, unsigned int hid, __s32 value) -+{ -+ if (type == EV_KEY && -+ (!test_bit(code, input->key)) == value) -+ input_event(input, EV_MSC, MSC_SCAN, hid); -+ input_event(input, type, code, value); -+} -+ - static int hidinput_apple_event(struct hid_device *hid, struct input_dev *input, - struct hid_usage *usage, __s32 value) - { -@@ -199,7 +208,8 @@ static int hidinput_apple_event(struct hid_device *hid, struct input_dev *input, - - if (usage->code == fn_keycode) { - asc->fn_on = !!value; -- input_event(input, usage->type, KEY_FN, value); -+ input_event_with_scancode(input, usage->type, KEY_FN, -+ usage->hid, value); - return 1; - } - -@@ -240,7 +250,8 @@ static int hidinput_apple_event(struct hid_device *hid, struct input_dev *input, - code = do_translate ? 
trans->to : trans->from; - } - -- input_event(input, usage->type, code, value); -+ input_event_with_scancode(input, usage->type, code, -+ usage->hid, value); - return 1; - } - -@@ -258,8 +269,8 @@ static int hidinput_apple_event(struct hid_device *hid, struct input_dev *input, - clear_bit(usage->code, - asc->pressed_numlock); - -- input_event(input, usage->type, trans->to, -- value); -+ input_event_with_scancode(input, usage->type, -+ trans->to, usage->hid, value); - } - - return 1; -@@ -270,7 +281,8 @@ static int hidinput_apple_event(struct hid_device *hid, struct input_dev *input, - if (hid->country == HID_COUNTRY_INTERNATIONAL_ISO) { - trans = apple_find_translation(apple_iso_keyboard, usage->code); - if (trans) { -- input_event(input, usage->type, trans->to, value); -+ input_event_with_scancode(input, usage->type, -+ trans->to, usage->hid, value); - return 1; - } - } -@@ -279,7 +291,8 @@ static int hidinput_apple_event(struct hid_device *hid, struct input_dev *input, - if (swap_opt_cmd) { - trans = apple_find_translation(swapped_option_cmd_keys, usage->code); - if (trans) { -- input_event(input, usage->type, trans->to, value); -+ input_event_with_scancode(input, usage->type, -+ trans->to, usage->hid, value); - return 1; - } - } -@@ -287,7 +300,8 @@ static int hidinput_apple_event(struct hid_device *hid, struct input_dev *input, - if (swap_fn_leftctrl) { - trans = apple_find_translation(swapped_fn_leftctrl_keys, usage->code); - if (trans) { -- input_event(input, usage->type, trans->to, value); -+ input_event_with_scancode(input, usage->type, -+ trans->to, usage->hid, value); - return 1; - } - } -@@ -306,8 +320,8 @@ static int apple_event(struct hid_device *hdev, struct hid_field *field, - - if ((asc->quirks & APPLE_INVERT_HWHEEL) && - usage->code == REL_HWHEEL) { -- input_event(field->hidinput->input, usage->type, usage->code, -- -value); -+ input_event_with_scancode(field->hidinput->input, usage->type, -+ usage->code, usage->hid, -value); - return 1; - } - --- 
-2.33.0 - diff --git a/debian/patches/bugfix/all/USB-gadget-detect-too-big-endpoint-0-requests.patch b/debian/patches/bugfix/all/USB-gadget-detect-too-big-endpoint-0-requests.patch new file mode 100644 index 000000000..68c0ab6de --- /dev/null +++ b/debian/patches/bugfix/all/USB-gadget-detect-too-big-endpoint-0-requests.patch @@ -0,0 +1,112 @@ +From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +Date: Thu, 9 Dec 2021 18:59:27 +0100 +Subject: USB: gadget: detect too-big endpoint 0 requests +Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=36dfdf11af49d3c009c711fb16f5c6e7a274505d +Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2021-39685 + +commit 153a2d7e3350cc89d406ba2d35be8793a64c2038 upstream. + +Sometimes USB hosts can ask for buffers that are too large from endpoint +0, which should not be allowed. If this happens for OUT requests, stall +the endpoint, but for IN requests, trim the request size to the endpoint +buffer size. + +Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + drivers/usb/gadget/composite.c | 12 ++++++++++++ + drivers/usb/gadget/legacy/dbgp.c | 13 +++++++++++++ + drivers/usb/gadget/legacy/inode.c | 16 +++++++++++++++- + 3 files changed, 40 insertions(+), 1 deletion(-) + +diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c +index 504c1cbc255d..1ef7922b57b6 100644 +--- a/drivers/usb/gadget/composite.c ++++ b/drivers/usb/gadget/composite.c +@@ -1679,6 +1679,18 @@ composite_setup(struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl) + struct usb_function *f = NULL; + u8 endp; + ++ if (w_length > USB_COMP_EP0_BUFSIZ) { ++ if (ctrl->bRequestType == USB_DIR_OUT) { ++ goto done; ++ } else { ++ /* Cast away the const, we are going to overwrite on purpose. 
*/ ++ __le16 *temp = (__le16 *)&ctrl->wLength; ++ ++ *temp = cpu_to_le16(USB_COMP_EP0_BUFSIZ); ++ w_length = USB_COMP_EP0_BUFSIZ; ++ } ++ } ++ + /* partial re-init of the response message; the function or the + * gadget might need to intercept e.g. a control-OUT completion + * when we delegate to it. +diff --git a/drivers/usb/gadget/legacy/dbgp.c b/drivers/usb/gadget/legacy/dbgp.c +index e1d566c9918a..e567afcb2794 100644 +--- a/drivers/usb/gadget/legacy/dbgp.c ++++ b/drivers/usb/gadget/legacy/dbgp.c +@@ -345,6 +345,19 @@ static int dbgp_setup(struct usb_gadget *gadget, + void *data = NULL; + u16 len = 0; + ++ if (length > DBGP_REQ_LEN) { ++ if (ctrl->bRequestType == USB_DIR_OUT) { ++ return err; ++ } else { ++ /* Cast away the const, we are going to overwrite on purpose. */ ++ __le16 *temp = (__le16 *)&ctrl->wLength; ++ ++ *temp = cpu_to_le16(DBGP_REQ_LEN); ++ length = DBGP_REQ_LEN; ++ } ++ } ++ ++ + if (request == USB_REQ_GET_DESCRIPTOR) { + switch (value>>8) { + case USB_DT_DEVICE: +diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c +index 539220d7f5b6..0a4041552ed1 100644 +--- a/drivers/usb/gadget/legacy/inode.c ++++ b/drivers/usb/gadget/legacy/inode.c +@@ -110,6 +110,8 @@ enum ep0_state { + /* enough for the whole queue: most events invalidate others */ + #define N_EVENT 5 + ++#define RBUF_SIZE 256 ++ + struct dev_data { + spinlock_t lock; + refcount_t count; +@@ -144,7 +146,7 @@ struct dev_data { + struct dentry *dentry; + + /* except this scratch i/o buffer for ep0 */ +- u8 rbuf [256]; ++ u8 rbuf[RBUF_SIZE]; + }; + + static inline void get_dev (struct dev_data *data) +@@ -1334,6 +1336,18 @@ gadgetfs_setup (struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl) + u16 w_value = le16_to_cpu(ctrl->wValue); + u16 w_length = le16_to_cpu(ctrl->wLength); + ++ if (w_length > RBUF_SIZE) { ++ if (ctrl->bRequestType == USB_DIR_OUT) { ++ return value; ++ } else { ++ /* Cast away the const, we are going to overwrite on purpose. 
*/ ++ __le16 *temp = (__le16 *)&ctrl->wLength; ++ ++ *temp = cpu_to_le16(RBUF_SIZE); ++ w_length = RBUF_SIZE; ++ } ++ } ++ + spin_lock (&dev->lock); + dev->setup_abort = 0; + if (dev->state == STATE_DEV_UNCONNECTED) { +-- +2.34.1 + diff --git a/debian/patches/bugfix/all/USB-gadget-zero-allocate-endpoint-0-buffers.patch b/debian/patches/bugfix/all/USB-gadget-zero-allocate-endpoint-0-buffers.patch new file mode 100644 index 000000000..9c53f8d0d --- /dev/null +++ b/debian/patches/bugfix/all/USB-gadget-zero-allocate-endpoint-0-buffers.patch @@ -0,0 +1,49 @@ +From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +Date: Thu, 9 Dec 2021 19:02:15 +0100 +Subject: USB: gadget: zero allocate endpoint 0 buffers +Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=6eea4ace62fa6414432692ee44f0c0a3d541d97a +Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2021-39685 + +commit 86ebbc11bb3f60908a51f3e41a17e3f477c2eaa3 upstream. + +Under some conditions, USB gadget devices can show allocated buffer +contents to a host. Fix this up by zero-allocating them so that any +extra data will all just be zeros. 
+ +Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com> +Tested-by: Szymon Heidrich <szymon.heidrich@gmail.com> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + drivers/usb/gadget/composite.c | 2 +- + drivers/usb/gadget/legacy/dbgp.c | 2 +- + 2 files changed, 2 insertions(+), 2 deletions(-) + +diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c +index 1ef7922b57b6..284eea9f6e4d 100644 +--- a/drivers/usb/gadget/composite.c ++++ b/drivers/usb/gadget/composite.c +@@ -2221,7 +2221,7 @@ int composite_dev_prepare(struct usb_composite_driver *composite, + if (!cdev->req) + return -ENOMEM; + +- cdev->req->buf = kmalloc(USB_COMP_EP0_BUFSIZ, GFP_KERNEL); ++ cdev->req->buf = kzalloc(USB_COMP_EP0_BUFSIZ, GFP_KERNEL); + if (!cdev->req->buf) + goto fail; + +diff --git a/drivers/usb/gadget/legacy/dbgp.c b/drivers/usb/gadget/legacy/dbgp.c +index e567afcb2794..355bc7dab9d5 100644 +--- a/drivers/usb/gadget/legacy/dbgp.c ++++ b/drivers/usb/gadget/legacy/dbgp.c +@@ -137,7 +137,7 @@ static int dbgp_enable_ep_req(struct usb_ep *ep) + goto fail_1; + } + +- req->buf = kmalloc(DBGP_REQ_LEN, GFP_KERNEL); ++ req->buf = kzalloc(DBGP_REQ_LEN, GFP_KERNEL); + if (!req->buf) { + err = -ENOMEM; + stp = 2; +-- +2.34.1 + diff --git a/debian/patches/bugfix/all/atlantic-Fix-OOB-read-and-write-in-hw_atl_utils_fw_r.patch b/debian/patches/bugfix/all/atlantic-Fix-OOB-read-and-write-in-hw_atl_utils_fw_r.patch new file mode 100644 index 000000000..a2953a107 --- /dev/null +++ b/debian/patches/bugfix/all/atlantic-Fix-OOB-read-and-write-in-hw_atl_utils_fw_r.patch @@ -0,0 +1,91 @@ +From: Zekun Shen <bruceshenzk@gmail.com> +Date: Sat, 13 Nov 2021 22:24:40 -0500 +Subject: atlantic: Fix OOB read and write in hw_atl_utils_fw_rpc_wait +Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=cec49b6dfdb0b9fefd0f17c32014223f73ee2605 +Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2021-43975 + +[ Upstream commit 
b922f622592af76b57cbc566eaeccda0b31a3496 ] + +This bug report shows up when running our research tools. The +report is an SOOB read, but it seems an SOOB write is also possible +a few lines below. + +In detail, fw.len and sw.len are inputs coming from io. A len +over the size of self->rpc triggers SOOB. The patch fixes the +bugs by adding sanity checks. + +The bugs are triggerable with compromised/malfunctioning devices. +They are potentially exploitable given they first leak up to +0xffff bytes and are able to overwrite the region later. + +The patch is tested with the QEMU emulator. +This is NOT tested with a real device. + +Attached is the log we found by fuzzing. + +BUG: KASAN: slab-out-of-bounds in + hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] +Read of size 4 at addr ffff888016260b08 by task modprobe/213 +CPU: 0 PID: 213 Comm: modprobe Not tainted 5.6.0 #1 +Call Trace: + dump_stack+0x76/0xa0 + print_address_description.constprop.0+0x16/0x200 + ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] + ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] + __kasan_report.cold+0x37/0x7c + ? aq_hw_read_reg_bit+0x60/0x70 [atlantic] + ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] + kasan_report+0xe/0x20 + hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] + hw_atl_utils_fw_rpc_call+0x95/0x130 [atlantic] + hw_atl_utils_fw_rpc_wait+0x176/0x210 [atlantic] + hw_atl_utils_mpi_create+0x229/0x2e0 [atlantic] + ? hw_atl_utils_fw_rpc_wait+0x210/0x210 [atlantic] + ? hw_atl_utils_initfw+0x9f/0x1c8 [atlantic] + hw_atl_utils_initfw+0x12a/0x1c8 [atlantic] + aq_nic_ndev_register+0x88/0x650 [atlantic] + ? aq_nic_ndev_init+0x235/0x3c0 [atlantic] + aq_pci_probe+0x731/0x9b0 [atlantic] + ? aq_pci_func_init+0xc0/0xc0 [atlantic] + local_pci_probe+0xd3/0x160 + pci_device_probe+0x23f/0x3e0 + +Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu> +Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> +Signed-off-by: David S.
Miller <davem@davemloft.net> +Signed-off-by: Sasha Levin <sashal@kernel.org> +--- + .../ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c | 10 ++++++++++ + 1 file changed, 10 insertions(+) + +diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c +index 404cbf60d3f2..da1d185f6d22 100644 +--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c ++++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c +@@ -559,6 +559,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self, + goto err_exit; + + if (fw.len == 0xFFFFU) { ++ if (sw.len > sizeof(self->rpc)) { ++ printk(KERN_INFO "Invalid sw len: %x\n", sw.len); ++ err = -EINVAL; ++ goto err_exit; ++ } + err = hw_atl_utils_fw_rpc_call(self, sw.len); + if (err < 0) + goto err_exit; +@@ -567,6 +572,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self, + + if (rpc) { + if (fw.len) { ++ if (fw.len > sizeof(self->rpc)) { ++ printk(KERN_INFO "Invalid fw len: %x\n", fw.len); ++ err = -EINVAL; ++ goto err_exit; ++ } + err = + hw_atl_utils_fw_downld_dwords(self, + self->rpc_addr, +-- +2.34.1 + diff --git a/debian/patches/bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-cmpxchg-s-r0-aux-reg.patch b/debian/patches/bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-cmpxchg-s-r0-aux-reg.patch new file mode 100644 index 000000000..aad743511 --- /dev/null +++ b/debian/patches/bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-cmpxchg-s-r0-aux-reg.patch @@ -0,0 +1,59 @@ +From a82fe085f344ef20b452cd5f481010ff96b5c4cd Mon Sep 17 00:00:00 2001 +From: Daniel Borkmann <daniel@iogearbox.net> +Date: Tue, 7 Dec 2021 11:02:02 +0000 +Subject: bpf: Fix kernel address leakage in atomic cmpxchg's r0 aux reg + +From: Daniel Borkmann <daniel@iogearbox.net> + +commit a82fe085f344ef20b452cd5f481010ff96b5c4cd upstream. 
+ +The implementation of BPF_CMPXCHG on a high level has the following parameters: + + .-[old-val] .-[new-val] + BPF_R0 = cmpxchg{32,64}(DST_REG + insn->off, BPF_R0, SRC_REG) + `-[mem-loc] `-[old-val] + +Given a BPF insn can only have two registers (dst, src), the R0 is fixed and +used as an auxiliary register for input (old value) as well as output (returning +old value from memory location). While the verifier performs a number of safety +checks, it fails to reject unprivileged programs where R0 contains a pointer as +old value. + +Through brute-forcing it takes about ~16sec on my machine to leak a kernel pointer +with BPF_CMPXCHG. The PoC is basically probing for kernel addresses by storing the +guessed address into the map slot as a scalar, and using the map value pointer as +R0 while SRC_REG has a canary value to detect a matching address. + +Fix it by checking R0 for pointers, and reject if that's the case for unprivileged +programs. + +Fixes: 5ffa25502b5a ("bpf: Add instructions for atomic_[cmp]xchg") +Reported-by: Ryota Shiga (Flatt Security) +Acked-by: Brendan Jackman <jackmanb@google.com> +Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> +Signed-off-by: Alexei Starovoitov <ast@kernel.org> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + kernel/bpf/verifier.c | 9 ++++++++- + 1 file changed, 8 insertions(+), 1 deletion(-) + +--- a/kernel/bpf/verifier.c ++++ b/kernel/bpf/verifier.c +@@ -4386,9 +4386,16 @@ static int check_atomic(struct bpf_verif + + if (insn->imm == BPF_CMPXCHG) { + /* Check comparison of R0 with memory location */ +- err = check_reg_arg(env, BPF_REG_0, SRC_OP); ++ const u32 aux_reg = BPF_REG_0; ++ ++ err = check_reg_arg(env, aux_reg, SRC_OP); + if (err) + return err; ++ ++ if (is_pointer_value(env, aux_reg)) { ++ verbose(env, "R%d leaks addr into mem\n", aux_reg); ++ return -EACCES; ++ } + } + + if (is_pointer_value(env, insn->src_reg)) { diff --git
a/debian/patches/bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-fetch.patch b/debian/patches/bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-fetch.patch new file mode 100644 index 000000000..d8035a87f --- /dev/null +++ b/debian/patches/bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-fetch.patch @@ -0,0 +1,69 @@ +From 7d3baf0afa3aa9102d6a521a8e4c41888bb79882 Mon Sep 17 00:00:00 2001 +From: Daniel Borkmann <daniel@iogearbox.net> +Date: Tue, 7 Dec 2021 12:51:56 +0000 +Subject: bpf: Fix kernel address leakage in atomic fetch + +From: Daniel Borkmann <daniel@iogearbox.net> + +commit 7d3baf0afa3aa9102d6a521a8e4c41888bb79882 upstream. + +The change in commit 37086bfdc737 ("bpf: Propagate stack bounds to registers +in atomics w/ BPF_FETCH") around check_mem_access() handling is buggy since +this would allow for unprivileged users to leak kernel pointers. For example, +an atomic fetch/and with -1 on a stack destination which holds a spilled +pointer will migrate the spilled register type into a scalar, which can then +be exported out of the program (since scalar != pointer) by dumping it into +a map value. + +The original implementation of XADD was preventing this situation by using +a double call to check_mem_access() one with BPF_READ and a subsequent one +with BPF_WRITE, in both cases passing -1 as a placeholder value instead of +register as per XADD semantics since it didn't contain a value fetch. The +BPF_READ also included a check in check_stack_read_fixed_off() which rejects +the program if the stack slot is of __is_pointer_value() if dst_regno < 0. +The latter is to distinguish whether we're dealing with a regular stack spill/ +fill or some arithmetical operation which is disallowed on non-scalars, see +also 6e7e63cbb023 ("bpf: Forbid XADD on spilled pointers for unprivileged +users") for more context on check_mem_access() and its handling of placeholder +value -1. 
+ +One minimally intrusive option to fix the leak is for the BPF_FETCH case to +initially check the BPF_READ case via check_mem_access() with -1 as register, +followed by the actual load case with non-negative load_reg to propagate +stack bounds to registers. + +Fixes: 37086bfdc737 ("bpf: Propagate stack bounds to registers in atomics w/ BPF_FETCH") +Reported-by: <n4ke4mry@gmail.com> +Acked-by: Brendan Jackman <jackmanb@google.com> +Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> +Signed-off-by: Alexei Starovoitov <ast@kernel.org> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + kernel/bpf/verifier.c | 12 +++++++++--- + 1 file changed, 9 insertions(+), 3 deletions(-) + +--- a/kernel/bpf/verifier.c ++++ b/kernel/bpf/verifier.c +@@ -4417,13 +4417,19 @@ static int check_atomic(struct bpf_verif + load_reg = -1; + } + +- /* check whether we can read the memory */ ++ /* Check whether we can read the memory, with second call for fetch ++ * case to simulate the register fill. ++ */ + err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off, +- BPF_SIZE(insn->code), BPF_READ, load_reg, true); ++ BPF_SIZE(insn->code), BPF_READ, -1, true); ++ if (!err && load_reg >= 0) ++ err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off, ++ BPF_SIZE(insn->code), BPF_READ, load_reg, ++ true); + if (err) + return err; + +- /* check whether we can write into the same memory */ ++ /* Check whether we can write into the same memory. 
*/ + err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off, + BPF_SIZE(insn->code), BPF_WRITE, -1, true); + if (err) diff --git a/debian/patches/bugfix/all/bpf-fix-signed-bounds-propagation-after-mov32.patch b/debian/patches/bugfix/all/bpf-fix-signed-bounds-propagation-after-mov32.patch new file mode 100644 index 000000000..f86318606 --- /dev/null +++ b/debian/patches/bugfix/all/bpf-fix-signed-bounds-propagation-after-mov32.patch @@ -0,0 +1,97 @@ +From 3cf2b61eb06765e27fec6799292d9fb46d0b7e60 Mon Sep 17 00:00:00 2001 +From: Daniel Borkmann <daniel@iogearbox.net> +Date: Wed, 15 Dec 2021 22:02:19 +0000 +Subject: bpf: Fix signed bounds propagation after mov32 + +From: Daniel Borkmann <daniel@iogearbox.net> + +commit 3cf2b61eb06765e27fec6799292d9fb46d0b7e60 upstream. + +For the case where both s32_{min,max}_value bounds are positive, the +__reg_assign_32_into_64() directly propagates them to their 64 bit +counterparts, otherwise it pessimises them into [0,u32_max] universe and +tries to refine them later on by learning through the tnum as per comment +in mentioned function. However, that does not always happen, for example, +in mov32 operation we call zext_32_to_64(dst_reg) which invokes the +__reg_assign_32_into_64() as is without subsequent bounds update as +elsewhere thus no refinement based on tnum takes place. 
+ +Thus, not calling into the __update_reg_bounds() / __reg_deduce_bounds() / +__reg_bound_offset() triplet as we do, for example, in case of ALU ops via +adjust_scalar_min_max_vals(), will lead to more pessimistic bounds when +dumping the full register state: + +Before fix: + + 0: (b4) w0 = -1 + 1: R0_w=invP4294967295 + (id=0,imm=ffffffff, + smin_value=4294967295,smax_value=4294967295, + umin_value=4294967295,umax_value=4294967295, + var_off=(0xffffffff; 0x0), + s32_min_value=-1,s32_max_value=-1, + u32_min_value=-1,u32_max_value=-1) + + 1: (bc) w0 = w0 + 2: R0_w=invP4294967295 + (id=0,imm=ffffffff, + smin_value=0,smax_value=4294967295, + umin_value=4294967295,umax_value=4294967295, + var_off=(0xffffffff; 0x0), + s32_min_value=-1,s32_max_value=-1, + u32_min_value=-1,u32_max_value=-1) + +Technically, the smin_value=0 and smax_value=4294967295 bounds are not +incorrect, but given the register is still a constant, they break assumptions +about const scalars that smin_value == smax_value and umin_value == umax_value. + +After fix: + + 0: (b4) w0 = -1 + 1: R0_w=invP4294967295 + (id=0,imm=ffffffff, + smin_value=4294967295,smax_value=4294967295, + umin_value=4294967295,umax_value=4294967295, + var_off=(0xffffffff; 0x0), + s32_min_value=-1,s32_max_value=-1, + u32_min_value=-1,u32_max_value=-1) + + 1: (bc) w0 = w0 + 2: R0_w=invP4294967295 + (id=0,imm=ffffffff, + smin_value=4294967295,smax_value=4294967295, + umin_value=4294967295,umax_value=4294967295, + var_off=(0xffffffff; 0x0), + s32_min_value=-1,s32_max_value=-1, + u32_min_value=-1,u32_max_value=-1) + +Without the smin_value == smax_value and umin_value == umax_value invariant +being intact for const scalars, it is possible to leak out kernel pointers +from unprivileged user space if the latter is enabled. For example, when such +registers are involved in pointer arithmetic, then adjust_ptr_min_max_vals() +will taint the destination register into an unknown scalar, and the latter +can be exported and stored e.g.
into a BPF map value. + +Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") +Reported-by: Kuee K1r0a <liulin063@gmail.com> +Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> +Reviewed-by: John Fastabend <john.fastabend@gmail.com> +Acked-by: Alexei Starovoitov <ast@kernel.org> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + kernel/bpf/verifier.c | 4 ++++ + 1 file changed, 4 insertions(+) + +--- a/kernel/bpf/verifier.c ++++ b/kernel/bpf/verifier.c +@@ -8120,6 +8120,10 @@ static int check_alu_op(struct bpf_verif + insn->dst_reg); + } + zext_32_to_64(dst_reg); ++ ++ __update_reg_bounds(dst_reg); ++ __reg_deduce_bounds(dst_reg); ++ __reg_bound_offset(dst_reg); + } + } else { + /* case: R = imm diff --git a/debian/patches/bugfix/all/bpf-make-32-64-bounds-propagation-slightly-more-robust.patch b/debian/patches/bugfix/all/bpf-make-32-64-bounds-propagation-slightly-more-robust.patch new file mode 100644 index 000000000..6eb5c51fa --- /dev/null +++ b/debian/patches/bugfix/all/bpf-make-32-64-bounds-propagation-slightly-more-robust.patch @@ -0,0 +1,62 @@ +From e572ff80f05c33cd0cb4860f864f5c9c044280b6 Mon Sep 17 00:00:00 2001 +From: Daniel Borkmann <daniel@iogearbox.net> +Date: Wed, 15 Dec 2021 22:28:48 +0000 +Subject: bpf: Make 32->64 bounds propagation slightly more robust + +From: Daniel Borkmann <daniel@iogearbox.net> + +commit e572ff80f05c33cd0cb4860f864f5c9c044280b6 upstream. + +Make the bounds propagation in __reg_assign_32_into_64() slightly more +robust and readable by aligning it similarly as we did back in the +__reg_combine_64_into_32() counterpart. Meaning, only propagate or +pessimize them as a smin/smax pair. 
+ +Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> +Reviewed-by: John Fastabend <john.fastabend@gmail.com> +Acked-by: Alexei Starovoitov <ast@kernel.org> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + kernel/bpf/verifier.c | 24 +++++++++++++++--------- + 1 file changed, 15 insertions(+), 9 deletions(-) + +--- a/kernel/bpf/verifier.c ++++ b/kernel/bpf/verifier.c +@@ -1358,22 +1358,28 @@ static void __reg_bound_offset(struct bp + reg->var_off = tnum_or(tnum_clear_subreg(var64_off), var32_off); + } + ++static bool __reg32_bound_s64(s32 a) ++{ ++ return a >= 0 && a <= S32_MAX; ++} ++ + static void __reg_assign_32_into_64(struct bpf_reg_state *reg) + { + reg->umin_value = reg->u32_min_value; + reg->umax_value = reg->u32_max_value; +- /* Attempt to pull 32-bit signed bounds into 64-bit bounds +- * but must be positive otherwise set to worse case bounds +- * and refine later from tnum. ++ ++ /* Attempt to pull 32-bit signed bounds into 64-bit bounds but must ++ * be positive otherwise set to worse case bounds and refine later ++ * from tnum. 
+ */ +- if (reg->s32_min_value >= 0 && reg->s32_max_value >= 0) +- reg->smax_value = reg->s32_max_value; +- else +- reg->smax_value = U32_MAX; +- if (reg->s32_min_value >= 0) ++ if (__reg32_bound_s64(reg->s32_min_value) && ++ __reg32_bound_s64(reg->s32_max_value)) { + reg->smin_value = reg->s32_min_value; +- else ++ reg->smax_value = reg->s32_max_value; ++ } else { + reg->smin_value = 0; ++ reg->smax_value = U32_MAX; ++ } + } + + static void __reg_combine_32_into_64(struct bpf_reg_state *reg) diff --git a/debian/patches/bugfix/all/ext4-limit-the-number-of-blocks-in-one-ADD_RANGE-TLV.patch b/debian/patches/bugfix/all/ext4-limit-the-number-of-blocks-in-one-ADD_RANGE-TLV.patch deleted file mode 100644 index 047eebefd..000000000 --- a/debian/patches/bugfix/all/ext4-limit-the-number-of-blocks-in-one-ADD_RANGE-TLV.patch +++ /dev/null @@ -1,61 +0,0 @@ -From: Hou Tao <houtao1@huawei.com> -Date: Fri, 20 Aug 2021 12:45:05 +0800 -Subject: ext4: limit the number of blocks in one ADD_RANGE TLV -Origin: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git/commit/?h=dev&id=a2c2f0826e2b75560b31daf1cd9a755ab93cf4c6 -Bug-Debian: https://bugs.debian.org/995425 - -Now EXT4_FC_TAG_ADD_RANGE uses ext4_extent to track the -newly-added blocks, but the limit on the max value of -ee_len field is ignored, and it can lead to BUG_ON as -shown below when running command "fallocate -l 128M file" -on a fast_commit-enabled fs: - - kernel BUG at fs/ext4/ext4_extents.h:199! - invalid opcode: 0000 [#1] SMP PTI - CPU: 3 PID: 624 Comm: fallocate Not tainted 5.14.0-rc6+ #1 - Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) - RIP: 0010:ext4_fc_write_inode_data+0x1f3/0x200 - Call Trace: - ? ext4_fc_write_inode+0xf2/0x150 - ext4_fc_commit+0x93b/0xa00 - ? ext4_fallocate+0x1ad/0x10d0 - ext4_sync_file+0x157/0x340 - ? 
ext4_sync_file+0x157/0x340 - vfs_fsync_range+0x49/0x80 - do_fsync+0x3d/0x70 - __x64_sys_fsync+0x14/0x20 - do_syscall_64+0x3b/0xc0 - entry_SYSCALL_64_after_hwframe+0x44/0xae - -Simply fixing it by limiting the number of blocks -in one EXT4_FC_TAG_ADD_RANGE TLV. - -Fixes: aa75f4d3daae ("ext4: main fast-commit commit path") -Cc: stable@kernel.org -Signed-off-by: Hou Tao <houtao1@huawei.com> -Signed-off-by: Theodore Ts'o <tytso@mit.edu> -Link: https://lore.kernel.org/r/20210820044505.474318-1-houtao1@huawei.com ---- - fs/ext4/fast_commit.c | 6 ++++++ - 1 file changed, 6 insertions(+) - -diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c -index 8e610a381862..8ea5a81e6554 100644 ---- a/fs/ext4/fast_commit.c -+++ b/fs/ext4/fast_commit.c -@@ -892,6 +892,12 @@ static int ext4_fc_write_inode_data(struct inode *inode, u32 *crc) - sizeof(lrange), (u8 *)&lrange, crc)) - return -ENOSPC; - } else { -+ unsigned int max = (map.m_flags & EXT4_MAP_UNWRITTEN) ? -+ EXT_UNWRITTEN_MAX_LEN : EXT_INIT_MAX_LEN; -+ -+ /* Limit the number of blocks in one extent */ -+ map.m_len = min(max, map.m_len); -+ - fc_ext.fc_ino = cpu_to_le32(inode->i_ino); - ex = (struct ext4_extent *)&fc_ext.fc_ex; - ex->ee_block = cpu_to_le32(map.m_lblk); --- -2.33.0 - diff --git a/debian/patches/bugfix/all/fget-check-that-the-fd-still-exists-after-getting-a-.patch b/debian/patches/bugfix/all/fget-check-that-the-fd-still-exists-after-getting-a-.patch new file mode 100644 index 000000000..9680bbb41 --- /dev/null +++ b/debian/patches/bugfix/all/fget-check-that-the-fd-still-exists-after-getting-a-.patch @@ -0,0 +1,67 @@ +From: Linus Torvalds <torvalds@linux-foundation.org> +Date: Wed, 1 Dec 2021 10:06:14 -0800 +Subject: fget: check that the fd still exists after getting a ref to it +Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=6fe4eadd54da3040cf6f6579ae157ae1395dc0f8 +Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2021-4083 + +commit 
054aa8d439b9185d4f5eb9a90282d1ce74772969 upstream. + +Jann Horn points out that there is another possible race wrt Unix domain +socket garbage collection, somewhat reminiscent of the one fixed in +commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK"). + +See the extended comment about the garbage collection requirements added +to unix_peek_fds() by that commit for details. + +The race comes from how we can locklessly look up a file descriptor just +as it is in the process of being closed, and with the right artificial +timing (Jann added a few strategic 'mdelay(500)' calls to do that), the +Unix domain socket garbage collector could see the reference count +decrement of the close() happen before fget() took its reference to the +file and the file was attached onto a new file descriptor. + +This is all (intentionally) correct on the 'struct file *' side, with +RCU lookups and lockless reference counting very much part of the +design. Getting that reference count out of order isn't a problem per +se. + +But the garbage collector can get confused by seeing this situation of +having seen a file not having any remaining external references and then +seeing it being attached to an fd. + +In commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK") the +fix was to serialize the file descriptor install with the garbage +collector by taking and releasing the unix_gc_lock. + +That's not really an option here, but since this all happens when we are +in the process of looking up a file descriptor, we can instead simply +just re-check that the file hasn't been closed in the meantime, and just +re-do the lookup if we raced with a concurrent close() of the same file +descriptor. 
+ +Reported-and-tested-by: Jann Horn <jannh@google.com> +Acked-by: Miklos Szeredi <mszeredi@redhat.com> +Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + fs/file.c | 4 ++++ + 1 file changed, 4 insertions(+) + +diff --git a/fs/file.c b/fs/file.c +index 8627dacfc424..ad4a8bf3cf10 100644 +--- a/fs/file.c ++++ b/fs/file.c +@@ -858,6 +858,10 @@ static struct file *__fget_files(struct files_struct *files, unsigned int fd, + file = NULL; + else if (!get_file_rcu_many(file, refs)) + goto loop; ++ else if (files_lookup_fd_raw(files, fd) != file) { ++ fput_many(file, refs); ++ goto loop; ++ } + } + rcu_read_unlock(); + +-- +2.34.1 + diff --git a/debian/patches/bugfix/all/firmware-remove-redundant-log-messages-from-drivers.patch b/debian/patches/bugfix/all/firmware-remove-redundant-log-messages-from-drivers.patch index f1744bff1..701376417 100644 --- a/debian/patches/bugfix/all/firmware-remove-redundant-log-messages-from-drivers.patch +++ b/debian/patches/bugfix/all/firmware-remove-redundant-log-messages-from-drivers.patch @@ -181,7 +181,7 @@ Index: linux/drivers/dma/imx-sdma.c =================================================================== --- linux.orig/drivers/dma/imx-sdma.c +++ linux/drivers/dma/imx-sdma.c -@@ -1756,11 +1756,8 @@ static void sdma_load_firmware(const str +@@ -1800,11 +1800,8 @@ static void sdma_load_firmware(const str const struct sdma_script_start_addrs *addr; unsigned short *ram_code; @@ -231,7 +231,7 @@ Index: linux/drivers/gpu/drm/r128/r128_cce.c =================================================================== --- linux.orig/drivers/gpu/drm/r128/r128_cce.c +++ linux/drivers/gpu/drm/r128/r128_cce.c -@@ -162,11 +162,8 @@ static int r128_cce_load_microcode(drm_r +@@ -161,11 +161,8 @@ static int r128_cce_load_microcode(drm_r } rc = request_firmware(&fw, FIRMWARE_NAME, &pdev->dev); platform_device_unregister(pdev); @@ -262,7 +262,7 @@ Index: 
linux/drivers/gpu/drm/radeon/r100.c =================================================================== --- linux.orig/drivers/gpu/drm/radeon/r100.c +++ linux/drivers/gpu/drm/radeon/r100.c -@@ -1047,9 +1047,7 @@ static int r100_cp_init_microcode(struct +@@ -1056,9 +1056,7 @@ static int r100_cp_init_microcode(struct } err = request_firmware(&rdev->me_fw, fw_name, rdev->dev); @@ -1289,7 +1289,7 @@ Index: linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c =================================================================== --- linux.orig/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ linux/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c -@@ -13471,11 +13471,8 @@ static int bnx2x_init_firmware(struct bn +@@ -13420,11 +13420,8 @@ static int bnx2x_init_firmware(struct bn BNX2X_DEV_INFO("Loading %s\n", fw_file_name); rc = request_firmware(&bp->firmware, fw_file_name, &bp->pdev->dev); @@ -1306,7 +1306,7 @@ Index: linux/drivers/net/ethernet/broadcom/tg3.c =================================================================== --- linux.orig/drivers/net/ethernet/broadcom/tg3.c +++ linux/drivers/net/ethernet/broadcom/tg3.c -@@ -11407,11 +11407,8 @@ static int tg3_request_firmware(struct t +@@ -11404,11 +11404,8 @@ static int tg3_request_firmware(struct t { const struct tg3_firmware_hdr *fw_hdr; @@ -1750,25 +1750,6 @@ Index: linux/drivers/net/wireless/intersil/p54/p54usb.c } complete(&priv->fw_wait_load); -Index: linux/drivers/net/wireless/intersil/prism54/islpci_dev.c -=================================================================== ---- linux.orig/drivers/net/wireless/intersil/prism54/islpci_dev.c -+++ linux/drivers/net/wireless/intersil/prism54/islpci_dev.c -@@ -80,12 +80,9 @@ isl_upload_firmware(islpci_private *priv - const u32 *fw_ptr; - - rc = request_firmware(&fw_entry, priv->firmware, PRISM_FW_PDEV); -- if (rc) { -- printk(KERN_ERR -- "%s: request_firmware() failed for '%s'\n", -- "prism54", priv->firmware); -+ if (rc) - return rc; -- } -+ - /* prepare the 
Direct Memory Base register */ - reg = ISL38XX_DEV_FIRMWARE_ADDRES; - Index: linux/drivers/net/wireless/marvell/libertas_tf/if_usb.c =================================================================== --- linux.orig/drivers/net/wireless/marvell/libertas_tf/if_usb.c @@ -2087,7 +2068,7 @@ Index: linux/drivers/scsi/qla1280.c =================================================================== --- linux.orig/drivers/scsi/qla1280.c +++ linux/drivers/scsi/qla1280.c -@@ -1514,8 +1514,6 @@ qla1280_request_firmware(struct scsi_qla +@@ -1513,8 +1513,6 @@ qla1280_request_firmware(struct scsi_qla err = request_firmware(&fw, fwname, &ha->pdev->dev); if (err) { @@ -2100,7 +2081,7 @@ Index: linux/drivers/scsi/qla2xxx/qla_init.c =================================================================== --- linux.orig/drivers/scsi/qla2xxx/qla_init.c +++ linux/drivers/scsi/qla2xxx/qla_init.c -@@ -8031,10 +8031,6 @@ qla2x00_load_risc(scsi_qla_host_t *vha, +@@ -8187,10 +8187,6 @@ qla2x00_load_risc(scsi_qla_host_t *vha, /* Load firmware blob. */ blob = qla2x00_request_firmware(vha); if (!blob) { @@ -2111,7 +2092,7 @@ Index: linux/drivers/scsi/qla2xxx/qla_init.c return QLA_FUNCTION_FAILED; } -@@ -8137,9 +8133,6 @@ qla24xx_load_risc_blob(scsi_qla_host_t * +@@ -8293,9 +8289,6 @@ qla24xx_load_risc_blob(scsi_qla_host_t * blob = qla2x00_request_firmware(vha); if (!blob) { @@ -2125,7 +2106,7 @@ Index: linux/drivers/scsi/qla2xxx/qla_nx.c =================================================================== --- linux.orig/drivers/scsi/qla2xxx/qla_nx.c +++ linux/drivers/scsi/qla2xxx/qla_nx.c -@@ -2429,11 +2429,8 @@ try_blob_fw: +@@ -2427,11 +2427,8 @@ try_blob_fw: /* Load firmware blob. 
*/ blob = ha->hablob = qla2x00_request_firmware(vha); @@ -2142,7 +2123,7 @@ Index: linux/drivers/scsi/qla2xxx/qla_os.c =================================================================== --- linux.orig/drivers/scsi/qla2xxx/qla_os.c +++ linux/drivers/scsi/qla2xxx/qla_os.c -@@ -7403,8 +7403,6 @@ qla2x00_request_firmware(scsi_qla_host_t +@@ -7552,8 +7552,6 @@ qla2x00_request_firmware(scsi_qla_host_t goto out; if (request_firmware(&blob->fw, blob->name, &ha->pdev->dev)) { @@ -2188,7 +2169,7 @@ Index: linux/drivers/staging/rtl8712/hal_init.c =================================================================== --- linux.orig/drivers/staging/rtl8712/hal_init.c +++ linux/drivers/staging/rtl8712/hal_init.c -@@ -73,8 +73,6 @@ int rtl871x_load_fw(struct _adapter *pad +@@ -72,8 +72,6 @@ int rtl871x_load_fw(struct _adapter *pad dev_info(dev, "r8712u: Loading firmware from \"%s\"\n", firmware_file); rc = request_firmware_nowait(THIS_MODULE, 1, firmware_file, dev, GFP_KERNEL, padapter, rtl871x_load_fw_cb); @@ -2201,7 +2182,7 @@ Index: linux/drivers/staging/vt6656/main_usb.c =================================================================== --- linux.orig/drivers/staging/vt6656/main_usb.c +++ linux/drivers/staging/vt6656/main_usb.c -@@ -109,11 +109,8 @@ static int vnt_download_firmware(struct +@@ -107,11 +107,8 @@ static int vnt_download_firmware(struct dev_dbg(dev, "---->Download firmware\n"); ret = request_firmware(&fw, FIRMWARE_NAME, dev); @@ -2547,7 +2528,7 @@ Index: linux/sound/isa/sscape.c =================================================================== --- linux.orig/sound/isa/sscape.c +++ linux/sound/isa/sscape.c -@@ -531,10 +531,8 @@ static int sscape_upload_bootblock(struc +@@ -520,10 +520,8 @@ static int sscape_upload_bootblock(struc int ret; ret = request_firmware(&init_fw, "scope.cod", card->dev); @@ -2559,7 +2540,7 @@ Index: linux/sound/isa/sscape.c ret = upload_dma_data(sscape, init_fw->data, init_fw->size); release_firmware(init_fw); -@@ -571,11 +569,8 @@ static 
int sscape_upload_microcode(struc +@@ -560,11 +558,8 @@ static int sscape_upload_microcode(struc snprintf(name, sizeof(name), "sndscape.co%d", version); err = request_firmware(&init_fw, name, card->dev); @@ -2605,7 +2586,7 @@ Index: linux/sound/pci/cs46xx/cs46xx_lib.c =================================================================== --- linux.orig/sound/pci/cs46xx/cs46xx_lib.c +++ linux/sound/pci/cs46xx/cs46xx_lib.c -@@ -3246,11 +3246,8 @@ int snd_cs46xx_start_dsp(struct snd_cs46 +@@ -3199,11 +3199,8 @@ int snd_cs46xx_start_dsp(struct snd_cs46 #ifdef CONFIG_SND_CS46XX_NEW_DSP for (i = 0; i < CS46XX_DSP_MODULES; i++) { err = load_firmware(chip, &chip->modules[i], module_names[i]); @@ -2655,7 +2636,7 @@ Index: linux/sound/pci/hda/hda_intel.c =================================================================== --- linux.orig/sound/pci/hda/hda_intel.c +++ linux/sound/pci/hda/hda_intel.c -@@ -2033,8 +2033,6 @@ static void azx_firmware_cb(const struct +@@ -2018,8 +2018,6 @@ static void azx_firmware_cb(const struct if (fw) chip->fw = fw; @@ -2668,14 +2649,14 @@ Index: linux/sound/pci/korg1212/korg1212.c =================================================================== --- linux.orig/sound/pci/korg1212/korg1212.c +++ linux/sound/pci/korg1212/korg1212.c -@@ -2338,7 +2338,6 @@ static int snd_korg1212_create(struct sn +@@ -2258,7 +2258,6 @@ static int snd_korg1212_create(struct sn err = request_firmware(&dsp_code, "korg/k1212.dsp", &pci->dev); if (err < 0) { - snd_printk(KERN_ERR "firmware not available\n"); - snd_korg1212_free(korg1212); return err; } + Index: linux/sound/pci/mixart/mixart_hwdep.c =================================================================== --- linux.orig/sound/pci/mixart/mixart_hwdep.c @@ -2732,7 +2713,7 @@ Index: linux/sound/pci/rme9652/hdsp.c =================================================================== --- linux.orig/sound/pci/rme9652/hdsp.c +++ linux/sound/pci/rme9652/hdsp.c -@@ -5209,11 +5209,8 @@ static int 
hdsp_request_fw_loader(struct +@@ -5196,11 +5196,8 @@ static int hdsp_request_fw_loader(struct return -EINVAL; } diff --git a/debian/patches/bugfix/all/fuse-release-pipe-buf-after-last-use.patch b/debian/patches/bugfix/all/fuse-release-pipe-buf-after-last-use.patch new file mode 100644 index 000000000..768dfdffb --- /dev/null +++ b/debian/patches/bugfix/all/fuse-release-pipe-buf-after-last-use.patch @@ -0,0 +1,51 @@ +From: Miklos Szeredi <mszeredi@redhat.com> +Date: Thu, 25 Nov 2021 14:05:18 +0100 +Subject: fuse: release pipe buf after last use +Origin: https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git/commit/?h=for-next&id=473441720c8616dfaf4451f9c7ea14f0eb5e5d65 +Bug-Debian: https://bugs.debian.org/1000504 + +Checking buf->flags should be done before the pipe_buf_release() is called +on the pipe buffer, since releasing the buffer might modify the flags. + +This is exactly what page_cache_pipe_buf_release() does, and which results +in the same VM_BUG_ON_PAGE(PageLRU(page)) that the original patch was +trying to fix. + +Reported-by: Justin Forbes <jmforbes@linuxtx.org> +Fixes: 712a951025c0 ("fuse: fix page stealing") +Cc: <stable@vger.kernel.org> # v2.6.35 +Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> +--- + fs/fuse/dev.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c +index 79f7eda49e06..cd54a529460d 100644 +--- a/fs/fuse/dev.c ++++ b/fs/fuse/dev.c +@@ -847,17 +847,17 @@ static int fuse_try_move_page(struct fuse_copy_state *cs, struct page **pagep) + + replace_page_cache_page(oldpage, newpage); + ++ get_page(newpage); ++ ++ if (!(buf->flags & PIPE_BUF_FLAG_LRU)) ++ lru_cache_add(newpage); ++ + /* + * Release while we have extra ref on stolen page. Otherwise + * anon_pipe_buf_release() might think the page can be reused. 
+ */ + pipe_buf_release(cs->pipe, buf); + +- get_page(newpage); +- +- if (!(buf->flags & PIPE_BUF_FLAG_LRU)) +- lru_cache_add(newpage); +- + err = 0; + spin_lock(&cs->req->waitq.lock); + if (test_bit(FR_ABORTED, &cs->req->flags)) +-- +2.34.0 + diff --git a/debian/patches/bugfix/all/nfsd-fix-use-after-free-due-to-delegation-race.patch b/debian/patches/bugfix/all/nfsd-fix-use-after-free-due-to-delegation-race.patch new file mode 100644 index 000000000..34584d8b5 --- /dev/null +++ b/debian/patches/bugfix/all/nfsd-fix-use-after-free-due-to-delegation-race.patch @@ -0,0 +1,67 @@ +From: "J. Bruce Fields" <bfields@redhat.com> +Date: Mon, 29 Nov 2021 15:08:00 -0500 +Subject: nfsd: fix use-after-free due to delegation race +Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=148c816f10fd11df27ca6a9b3238cdd42fa72cd3 +Bug-Debian: https://bugs.debian.org/988044 + +commit 548ec0805c399c65ed66c6641be467f717833ab5 upstream. + +A delegation break could arrive as soon as we've called vfs_setlease. A +delegation break runs a callback which immediately (in +nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. If we +then exit nfs4_set_delegation without hashing the delegation, it will be +freed as soon as the callback is done with it, without ever being +removed from del_recall_lru. + +Symptoms show up later as use-after-free or list corruption warnings, +usually in the laundromat thread. + +I suspect aba2072f4523 "nfsd: grant read delegations to clients holding +writes" made this bug easier to hit, but I looked as far back as v3.0 +and it looks to me it already had the same problem. So I'm not sure +where the bug was introduced; it may have been there from the beginning. + +Cc: stable@vger.kernel.org +Signed-off-by: J. 
Bruce Fields <bfields@redhat.com> +Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> +--- + fs/nfsd/nfs4state.c | 9 +++++++-- + 1 file changed, 7 insertions(+), 2 deletions(-) + +diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c +index 3f4027a5de88..61301affb4c1 100644 +--- a/fs/nfsd/nfs4state.c ++++ b/fs/nfsd/nfs4state.c +@@ -1207,6 +1207,11 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp) + return 0; + } + ++static bool delegation_hashed(struct nfs4_delegation *dp) ++{ ++ return !(list_empty(&dp->dl_perfile)); ++} ++ + static bool + unhash_delegation_locked(struct nfs4_delegation *dp) + { +@@ -1214,7 +1219,7 @@ unhash_delegation_locked(struct nfs4_delegation *dp) + + lockdep_assert_held(&state_lock); + +- if (list_empty(&dp->dl_perfile)) ++ if (!delegation_hashed(dp)) + return false; + + dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID; +@@ -4598,7 +4603,7 @@ static void nfsd4_cb_recall_prepare(struct nfsd4_callback *cb) + * queued for a lease break. Don't queue it again. + */ + spin_lock(&state_lock); +- if (dp->dl_time == 0) { ++ if (delegation_hashed(dp) && dp->dl_time == 0) { + dp->dl_time = ktime_get_boottime_seconds(); + list_add_tail(&dp->dl_recall_lru, &nn->del_recall_lru); + } +-- +2.34.1 + diff --git a/debian/patches/bugfix/all/partially-revert-usb-kconfig-using-select-for-usb_co.patch b/debian/patches/bugfix/all/partially-revert-usb-kconfig-using-select-for-usb_co.patch deleted file mode 100644 index 268c99151..000000000 --- a/debian/patches/bugfix/all/partially-revert-usb-kconfig-using-select-for-usb_co.patch +++ /dev/null @@ -1,30 +0,0 @@ -From: Ben Hutchings <ben@decadent.org.uk> -Date: Wed, 11 Jan 2017 04:30:40 +0000 -Subject: Partially revert "usb: Kconfig: using select for USB_COMMON dependency" -Forwarded: https://marc.info/?l=linux-usb&m=149248300414300 - -This reverts commit cb9c1cfc86926d0e86d19c8e34f6c23458cd3478 for -USB_LED_TRIG. 
This config symbol has bool type and enables extra code -in usb_common itself, not a separate driver. Enabling it should not -force usb_common to be built-in! - -Fixes: cb9c1cfc8692 ("usb: Kconfig: using select for USB_COMMON dependency") -Signed-off-by: Ben Hutchings <ben@decadent.org.uk> ---- - drivers/usb/common/Kconfig | 3 +-- - 1 file changed, 1 insertion(+), 2 deletions(-) - -diff --git a/drivers/usb/common/Kconfig b/drivers/usb/common/Kconfig -index d611477aae41..196f4a397587 100644 ---- a/drivers/usb/common/Kconfig -+++ b/drivers/usb/common/Kconfig -@@ -6,8 +6,7 @@ config USB_COMMON - - config USB_LED_TRIG - bool "USB LED Triggers" -- depends on LEDS_CLASS && LEDS_TRIGGERS -- select USB_COMMON -+ depends on LEDS_CLASS && USB_COMMON && LEDS_TRIGGERS - help - This option adds LED triggers for USB host and/or gadget activity. - diff --git a/debian/patches/bugfix/all/perf-srcline-Use-long-running-addr2line-per-DSO.patch b/debian/patches/bugfix/all/perf-srcline-Use-long-running-addr2line-per-DSO.patch new file mode 100644 index 000000000..f1996e947 --- /dev/null +++ b/debian/patches/bugfix/all/perf-srcline-Use-long-running-addr2line-per-DSO.patch @@ -0,0 +1,459 @@ +From: Tony Garnock-Jones <tonyg@leastfixedpoint.com> +Date: Thu, 16 Sep 2021 14:09:39 +0200 +Subject: perf srcline: Use long-running addr2line per DSO +Origin: https://git.kernel.org/linus/be8ecc57f180415e8a7c1cc5620c5236be2a7e56 +Bug-Debian: https://bugs.debian.org/911815 + +Invoking addr2line in a separate subprocess, one for each required +lookup, takes a terribly long time. + +This patch introduces a long-running addr2line process for each DSO, +*DRAMATICALLY* speeding up runs of perf. + +What used to take tens of minutes now takes tens of seconds. 
+ +Debian bug report about this issue: + + https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911815 + +Signed-off-by: Tony Garnock-Jones <tonyg@leastfixedpoint.com> +Tested-by: Ian Rogers <irogers@google.com> +Cc: Ingo Molnar <mingo@redhat.com> +Cc: Peter Zijlstra <peterz@infradead.org> +Link: https://lore.kernel.org/r/20210916120939.453536-1-tonyg@leastfixedpoint.com +Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> +--- + tools/perf/util/srcline.c | 338 ++++++++++++++++++++++++++++---------- + 1 file changed, 250 insertions(+), 88 deletions(-) + +diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c +index 5b7d6c16d33f..af468e3bb6fa 100644 +--- a/tools/perf/util/srcline.c ++++ b/tools/perf/util/srcline.c +@@ -1,8 +1,10 @@ + // SPDX-License-Identifier: GPL-2.0 + #include <inttypes.h> ++#include <signal.h> + #include <stdio.h> + #include <stdlib.h> + #include <string.h> ++#include <sys/types.h> + + #include <linux/kernel.h> + #include <linux/string.h> +@@ -15,6 +17,7 @@ + #include "srcline.h" + #include "string2.h" + #include "symbol.h" ++#include "subcmd/run-command.h" + + bool srcline_full_filename; + +@@ -119,6 +122,8 @@ static struct symbol *new_inline_sym(struct dso *dso, + return inline_sym; + } + ++#define MAX_INLINE_NEST 1024 ++ + #ifdef HAVE_LIBBFD_SUPPORT + + /* +@@ -273,8 +278,6 @@ static void addr2line_cleanup(struct a2l_data *a2l) + free(a2l); + } + +-#define MAX_INLINE_NEST 1024 +- + static int inline_list__append_dso_a2l(struct dso *dso, + struct inline_node *node, + struct symbol *sym) +@@ -361,26 +364,14 @@ void dso__free_a2l(struct dso *dso) + dso->a2l = NULL; + } + +-static struct inline_node *addr2inlines(const char *dso_name, u64 addr, +- struct dso *dso, struct symbol *sym) +-{ +- struct inline_node *node; +- +- node = zalloc(sizeof(*node)); +- if (node == NULL) { +- perror("not enough memory for the inline node"); +- return NULL; +- } +- +- INIT_LIST_HEAD(&node->val); +- node->addr = addr; +- +- addr2line(dso_name, addr, 
NULL, NULL, dso, true, node, sym); +- return node; +-} +- + #else /* HAVE_LIBBFD_SUPPORT */ + ++struct a2l_subprocess { ++ struct child_process addr2line; ++ FILE *to_child; ++ FILE *from_child; ++}; ++ + static int filename_split(char *filename, unsigned int *line_nr) + { + char *sep; +@@ -402,114 +393,285 @@ static int filename_split(char *filename, unsigned int *line_nr) + return 0; + } + +-static int addr2line(const char *dso_name, u64 addr, +- char **file, unsigned int *line_nr, +- struct dso *dso __maybe_unused, +- bool unwind_inlines __maybe_unused, +- struct inline_node *node __maybe_unused, +- struct symbol *sym __maybe_unused) ++static void addr2line_subprocess_cleanup(struct a2l_subprocess *a2l) + { +- FILE *fp; +- char cmd[PATH_MAX]; +- char *filename = NULL; +- size_t len; +- int ret = 0; ++ if (a2l->addr2line.pid != -1) { ++ kill(a2l->addr2line.pid, SIGKILL); ++ finish_command(&a2l->addr2line); /* ignore result, we don't care */ ++ a2l->addr2line.pid = -1; ++ } + +- scnprintf(cmd, sizeof(cmd), "addr2line -e %s %016"PRIx64, +- dso_name, addr); ++ if (a2l->to_child != NULL) { ++ fclose(a2l->to_child); ++ a2l->to_child = NULL; ++ } + +- fp = popen(cmd, "r"); +- if (fp == NULL) { +- pr_warning("popen failed for %s\n", dso_name); +- return 0; ++ if (a2l->from_child != NULL) { ++ fclose(a2l->from_child); ++ a2l->from_child = NULL; ++ } ++ ++ free(a2l); ++} ++ ++static struct a2l_subprocess *addr2line_subprocess_init(const char *path) ++{ ++ const char *argv[] = { "addr2line", "-e", path, "-i", "-f", NULL }; ++ struct a2l_subprocess *a2l = zalloc(sizeof(*a2l)); ++ int start_command_status = 0; ++ ++ if (a2l == NULL) ++ goto out; ++ ++ a2l->to_child = NULL; ++ a2l->from_child = NULL; ++ ++ a2l->addr2line.pid = -1; ++ a2l->addr2line.in = -1; ++ a2l->addr2line.out = -1; ++ a2l->addr2line.no_stderr = 1; ++ ++ a2l->addr2line.argv = argv; ++ start_command_status = start_command(&a2l->addr2line); ++ a2l->addr2line.argv = NULL; /* it's not used after start_command; 
avoid dangling pointers */ ++ ++ if (start_command_status != 0) { ++ pr_warning("could not start addr2line for %s: start_command return code %d\n", ++ path, ++ start_command_status); ++ goto out; + } + +- if (getline(&filename, &len, fp) < 0 || !len) { +- pr_warning("addr2line has no output for %s\n", dso_name); ++ a2l->to_child = fdopen(a2l->addr2line.in, "w"); ++ if (a2l->to_child == NULL) { ++ pr_warning("could not open write-stream to addr2line of %s\n", path); + goto out; + } + +- ret = filename_split(filename, line_nr); +- if (ret != 1) { +- free(filename); ++ a2l->from_child = fdopen(a2l->addr2line.out, "r"); ++ if (a2l->from_child == NULL) { ++ pr_warning("could not open read-stream from addr2line of %s\n", path); + goto out; + } + +- *file = filename; ++ return a2l; + + out: +- pclose(fp); +- return ret; ++ if (a2l) ++ addr2line_subprocess_cleanup(a2l); ++ ++ return NULL; + } + +-void dso__free_a2l(struct dso *dso __maybe_unused) ++static int read_addr2line_record(struct a2l_subprocess *a2l, ++ char **function, ++ char **filename, ++ unsigned int *line_nr) + { ++ /* ++ * Returns: ++ * -1 ==> error ++ * 0 ==> sentinel (or other ill-formed) record read ++ * 1 ==> a genuine record read ++ */ ++ char *line = NULL; ++ size_t line_len = 0; ++ unsigned int dummy_line_nr = 0; ++ int ret = -1; ++ ++ if (function != NULL) ++ zfree(function); ++ ++ if (filename != NULL) ++ zfree(filename); ++ ++ if (line_nr != NULL) ++ *line_nr = 0; ++ ++ if (getline(&line, &line_len, a2l->from_child) < 0 || !line_len) ++ goto error; ++ ++ if (function != NULL) ++ *function = strdup(strim(line)); ++ ++ zfree(&line); ++ line_len = 0; ++ ++ if (getline(&line, &line_len, a2l->from_child) < 0 || !line_len) ++ goto error; ++ ++ if (filename_split(line, line_nr == NULL ? 
&dummy_line_nr : line_nr) == 0) { ++ ret = 0; ++ goto error; ++ } ++ ++ if (filename != NULL) ++ *filename = strdup(line); ++ ++ zfree(&line); ++ line_len = 0; ++ ++ return 1; ++ ++error: ++ free(line); ++ if (function != NULL) ++ zfree(function); ++ if (filename != NULL) ++ zfree(filename); ++ return ret; + } + +-static struct inline_node *addr2inlines(const char *dso_name, u64 addr, +- struct dso *dso __maybe_unused, +- struct symbol *sym) ++static int inline_list__append_record(struct dso *dso, ++ struct inline_node *node, ++ struct symbol *sym, ++ const char *function, ++ const char *filename, ++ unsigned int line_nr) + { +- FILE *fp; +- char cmd[PATH_MAX]; +- struct inline_node *node; +- char *filename = NULL; +- char *funcname = NULL; +- size_t filelen, funclen; +- unsigned int line_nr = 0; ++ struct symbol *inline_sym = new_inline_sym(dso, sym, function); + +- scnprintf(cmd, sizeof(cmd), "addr2line -e %s -i -f %016"PRIx64, +- dso_name, addr); ++ return inline_list__append(inline_sym, srcline_from_fileline(filename, line_nr), node); ++} + +- fp = popen(cmd, "r"); +- if (fp == NULL) { +- pr_err("popen failed for %s\n", dso_name); +- return NULL; ++static int addr2line(const char *dso_name, u64 addr, ++ char **file, unsigned int *line_nr, ++ struct dso *dso, ++ bool unwind_inlines, ++ struct inline_node *node, ++ struct symbol *sym __maybe_unused) ++{ ++ struct a2l_subprocess *a2l = dso->a2l; ++ char *record_function = NULL; ++ char *record_filename = NULL; ++ unsigned int record_line_nr = 0; ++ int record_status = -1; ++ int ret = 0; ++ size_t inline_count = 0; ++ ++ if (!a2l) { ++ dso->a2l = addr2line_subprocess_init(dso_name); ++ a2l = dso->a2l; + } + +- node = zalloc(sizeof(*node)); +- if (node == NULL) { +- perror("not enough memory for the inline node"); ++ if (a2l == NULL) { ++ if (!symbol_conf.disable_add2line_warn) ++ pr_warning("%s %s: addr2line_subprocess_init failed\n", __func__, dso_name); + goto out; + } + +- INIT_LIST_HEAD(&node->val); +- 
node->addr = addr; +- +- /* addr2line -f generates two lines for each inlined functions */ +- while (getline(&funcname, &funclen, fp) != -1) { +- char *srcline; +- struct symbol *inline_sym; ++ /* ++ * Send our request and then *deliberately* send something that can't be interpreted as ++ * a valid address to ask addr2line about (namely, ","). This causes addr2line to first ++ * write out the answer to our request, in an unbounded/unknown number of records, and ++ * then to write out the lines "??" and "??:0", so that we can detect when it has ++ * finished giving us anything useful. We have to be careful about the first record, ++ * though, because it may be genuinely unknown, in which case we'll get two sets of ++ * "??"/"??:0" lines. ++ */ ++ if (fprintf(a2l->to_child, "%016"PRIx64"\n,\n", addr) < 0 || fflush(a2l->to_child) != 0) { ++ pr_warning("%s %s: could not send request\n", __func__, dso_name); ++ goto out; ++ } + +- strim(funcname); ++ switch (read_addr2line_record(a2l, &record_function, &record_filename, &record_line_nr)) { ++ case -1: ++ pr_warning("%s %s: could not read first record\n", __func__, dso_name); ++ goto out; ++ case 0: ++ /* ++ * The first record was invalid, so return failure, but first read another ++ * record, since we asked a junk question and have to clear the answer out. ++ */ ++ switch (read_addr2line_record(a2l, NULL, NULL, NULL)) { ++ case -1: ++ pr_warning("%s %s: could not read delimiter record\n", __func__, dso_name); ++ break; ++ case 0: ++ /* As expected. 
*/ ++ break; ++ default: ++ pr_warning("%s %s: unexpected record instead of sentinel", ++ __func__, dso_name); ++ break; ++ } ++ goto out; ++ default: ++ break; ++ } + +- if (getline(&filename, &filelen, fp) == -1) +- goto out; ++ if (file) { ++ *file = strdup(record_filename); ++ ret = 1; ++ } ++ if (line_nr) ++ *line_nr = record_line_nr; + +- if (filename_split(filename, &line_nr) != 1) ++ if (unwind_inlines) { ++ if (node && inline_list__append_record(dso, node, sym, ++ record_function, ++ record_filename, ++ record_line_nr)) { ++ ret = 0; + goto out; ++ } ++ } + +- srcline = srcline_from_fileline(filename, line_nr); +- inline_sym = new_inline_sym(dso, sym, funcname); +- +- if (inline_list__append(inline_sym, srcline, node) != 0) { +- free(srcline); +- if (inline_sym && inline_sym->inlined) +- symbol__delete(inline_sym); +- goto out; ++ /* We have to read the records even if we don't care about the inline info. */ ++ while ((record_status = read_addr2line_record(a2l, ++ &record_function, ++ &record_filename, ++ &record_line_nr)) == 1) { ++ if (unwind_inlines && node && inline_count++ < MAX_INLINE_NEST) { ++ if (inline_list__append_record(dso, node, sym, ++ record_function, ++ record_filename, ++ record_line_nr)) { ++ ret = 0; ++ goto out; ++ } ++ ret = 1; /* found at least one inline frame */ + } + } + + out: +- pclose(fp); +- free(filename); +- free(funcname); ++ free(record_function); ++ free(record_filename); ++ return ret; ++} + +- return node; ++void dso__free_a2l(struct dso *dso) ++{ ++ struct a2l_subprocess *a2l = dso->a2l; ++ ++ if (!a2l) ++ return; ++ ++ addr2line_subprocess_cleanup(a2l); ++ ++ dso->a2l = NULL; + } + + #endif /* HAVE_LIBBFD_SUPPORT */ + ++static struct inline_node *addr2inlines(const char *dso_name, u64 addr, ++ struct dso *dso, struct symbol *sym) ++{ ++ struct inline_node *node; ++ ++ node = zalloc(sizeof(*node)); ++ if (node == NULL) { ++ perror("not enough memory for the inline node"); ++ return NULL; ++ } ++ ++ 
INIT_LIST_HEAD(&node->val); ++ node->addr = addr; ++ ++ addr2line(dso_name, addr, NULL, NULL, dso, true, node, sym); ++ return node; ++} ++ + /* + * Number of addr2line failures (without success) before disabling it for that + * dso. +-- +2.33.1 + diff --git a/debian/patches/bugfix/all/radeon-amdgpu-firmware-is-required-for-drm-and-kms-on-r600-onward.patch b/debian/patches/bugfix/all/radeon-amdgpu-firmware-is-required-for-drm-and-kms-on-r600-onward.patch index bf508eaf3..e0372e4a0 100644 --- a/debian/patches/bugfix/all/radeon-amdgpu-firmware-is-required-for-drm-and-kms-on-r600-onward.patch +++ b/debian/patches/bugfix/all/radeon-amdgpu-firmware-is-required-for-drm-and-kms-on-r600-onward.patch @@ -41,7 +41,7 @@ Index: linux/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c #include "amdgpu.h" #include "amdgpu_irq.h" -@@ -1227,6 +1229,28 @@ MODULE_DEVICE_TABLE(pci, pciidlist); +@@ -1246,6 +1248,28 @@ MODULE_DEVICE_TABLE(pci, pciidlist); static const struct drm_driver amdgpu_kms_driver; @@ -70,7 +70,7 @@ Index: linux/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c static int amdgpu_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { -@@ -1291,6 +1315,11 @@ static int amdgpu_pci_probe(struct pci_d +@@ -1310,6 +1334,11 @@ static int amdgpu_pci_probe(struct pci_d } #endif @@ -80,7 +80,7 @@ Index: linux/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c + } + /* Get rid of things like offb */ - ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "amdgpudrmfb"); + ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &amdgpu_kms_driver); if (ret) Index: linux/drivers/gpu/drm/radeon/radeon_drv.c =================================================================== @@ -135,5 +135,5 @@ Index: linux/drivers/gpu/drm/radeon/radeon_drv.c + } + /* Get rid of things like offb */ - ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, "radeondrmfb"); + ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &kms_driver); if (ret) diff --git 
a/debian/patches/bugfix/all/tools-perf-pmu-events-fix-reproducibility.patch b/debian/patches/bugfix/all/tools-perf-pmu-events-fix-reproducibility.patch index c0cf9b073..f6deb46bf 100644 --- a/debian/patches/bugfix/all/tools-perf-pmu-events-fix-reproducibility.patch +++ b/debian/patches/bugfix/all/tools-perf-pmu-events-fix-reproducibility.patch @@ -38,7 +38,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> int verbose; char *prog; -@@ -971,6 +983,78 @@ +@@ -971,6 +983,78 @@ static int get_maxfds(void) */ static FILE *eventsfp; static char *mapfile; @@ -109,7 +109,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> + +out: + for (i = 0; i < state.n; i++) -+ free(state.entries[i].fpath); ++ free((char *)state.entries[i].fpath); + free(state.entries);; + + return rc; @@ -117,7 +117,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> static int is_leaf_dir(const char *fpath) { -@@ -1023,19 +1107,19 @@ +@@ -1023,19 +1107,19 @@ static int is_json_file(const char *name return 0; } @@ -140,7 +140,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> int typeflag, struct FTW *ftwbuf) { char *tblname, *bname; -@@ -1065,9 +1149,9 @@ +@@ -1065,9 +1149,9 @@ static int process_one_file(const char * } else bname = (char *) fpath + ftwbuf->base; @@ -152,7 +152,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> /* base dir or too deep */ if (level == 0 || level > 4) -@@ -1241,21 +1325,21 @@ +@@ -1241,21 +1325,21 @@ int main(int argc, char *argv[]) */ maxfds = get_maxfds(); diff --git a/debian/patches/bugfix/arm/ARM-dts-sun7i-A20-olinuxino-lime2-Fix-ethernet-phy-m.patch b/debian/patches/bugfix/arm/ARM-dts-sun7i-A20-olinuxino-lime2-Fix-ethernet-phy-m.patch deleted file mode 100644 index 48028d439..000000000 --- a/debian/patches/bugfix/arm/ARM-dts-sun7i-A20-olinuxino-lime2-Fix-ethernet-phy-m.patch +++ /dev/null @@ -1,38 +0,0 @@ -From: =?UTF-8?q?Bastien=20Roucari=C3=A8s?= <rouca@debian.org> -Date: Thu, 16 Sep 2021 08:17:21 +0000 -Subject: ARM: dts: sun7i: 
A20-olinuxino-lime2: Fix ethernet phy-mode -MIME-Version: 1.0 -Content-Type: text/plain; charset=UTF-8 -Content-Transfer-Encoding: 8bit -Origin: https://lore.kernel.org/r/20210916081721.237137-1-rouca@debian.org - -Commit bbc4d71d6354 ("net: phy: realtek: fix rtl8211e rx/tx delay -config") sets the RX/TX delay according to the phy-mode property in the -device tree. For the A20-olinuxino-lime2 board this is "rgmii", which is the -wrong setting. - -Following the example of a900cac3750b ("ARM: dts: sun7i: a20: bananapro: -Fix ethernet phy-mode") the phy-mode is changed to "rgmii-id" which gets -the Ethernet working again on this board. - -Signed-off-by: Bastien Roucariès <rouca@debian.org> ---- - arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts b/arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts -index 8077f1716fbc..ecb91fb899ff 100644 ---- a/arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts -+++ b/arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts -@@ -112,7 +112,7 @@ &gmac { - pinctrl-names = "default"; - pinctrl-0 = <&gmac_rgmii_pins>; - phy-handle = <&phy1>; -- phy-mode = "rgmii"; -+ phy-mode = "rgmii-id"; - status = "okay"; - }; - --- -2.33.0 - diff --git a/debian/patches/bugfix/mipsel/bpf-mips-Validate-conditional-branch-offsets.patch b/debian/patches/bugfix/mipsel/bpf-mips-Validate-conditional-branch-offsets.patch deleted file mode 100644 index 98c306840..000000000 --- a/debian/patches/bugfix/mipsel/bpf-mips-Validate-conditional-branch-offsets.patch +++ /dev/null @@ -1,267 +0,0 @@ -From: Piotr Krysiuk <piotras@gmail.com> -Date: Wed, 15 Sep 2021 17:04:37 +0100 -Subject: bpf, mips: Validate conditional branch offsets -Origin: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=37cb28ec7d3a36a5bace7063a3dba633ab110f8b -Bug-Debian-Security: https://security-tracker.debian.org/tracker/CVE-2021-38300 - -The conditional branch instructions 
on MIPS use 18-bit signed offsets -allowing for a branch range of 128 KBytes (backward and forward). -However, this limit is not observed by the cBPF JIT compiler, and so -the JIT compiler emits out-of-range branches when translating certain -cBPF programs. A specific example of such a cBPF program is included in -the "BPF_MAXINSNS: exec all MSH" test from lib/test_bpf.c that executes -anomalous machine code containing incorrect branch offsets under JIT. - -Furthermore, this issue can be abused to craft undesirable machine -code, where the control flow is hijacked to execute arbitrary Kernel -code. - -The following steps can be used to reproduce the issue: - - # echo 1 > /proc/sys/net/core/bpf_jit_enable - # modprobe test_bpf test_name="BPF_MAXINSNS: exec all MSH" - -This should produce multiple warnings from build_bimm() similar to: - - ------------[ cut here ]------------ - WARNING: CPU: 0 PID: 209 at arch/mips/mm/uasm-mips.c:210 build_insn+0x558/0x590 - Micro-assembler field overflow - Modules linked in: test_bpf(+) - CPU: 0 PID: 209 Comm: modprobe Not tainted 5.14.3 #1 - Stack : 00000000 807bb824 82b33c9c 801843c0 00000000 00000004 00000000 63c9b5ee - 82b33af4 80999898 80910000 80900000 82fd6030 00000001 82b33a98 82087180 - 00000000 00000000 80873b28 00000000 000000fc 82b3394c 00000000 2e34312e - 6d6d6f43 809a180f 809a1836 6f6d203a 80900000 00000001 82b33bac 80900000 - 00027f80 00000000 00000000 807bb824 00000000 804ed790 001cc317 00000001 - [...] 
- Call Trace: - [<80108f44>] show_stack+0x38/0x118 - [<807a7aac>] dump_stack_lvl+0x5c/0x7c - [<807a4b3c>] __warn+0xcc/0x140 - [<807a4c3c>] warn_slowpath_fmt+0x8c/0xb8 - [<8011e198>] build_insn+0x558/0x590 - [<8011e358>] uasm_i_bne+0x20/0x2c - [<80127b48>] build_body+0xa58/0x2a94 - [<80129c98>] bpf_jit_compile+0x114/0x1e4 - [<80613fc4>] bpf_prepare_filter+0x2ec/0x4e4 - [<8061423c>] bpf_prog_create+0x80/0xc4 - [<c0a006e4>] test_bpf_init+0x300/0xba8 [test_bpf] - [<8010051c>] do_one_initcall+0x50/0x1d4 - [<801c5e54>] do_init_module+0x60/0x220 - [<801c8b20>] sys_finit_module+0xc4/0xfc - [<801144d0>] syscall_common+0x34/0x58 - [...] - ---[ end trace a287d9742503c645 ]--- - -Then the anomalous machine code executes: - -=> 0xc0a18000: addiu sp,sp,-16 - 0xc0a18004: sw s3,0(sp) - 0xc0a18008: sw s4,4(sp) - 0xc0a1800c: sw s5,8(sp) - 0xc0a18010: sw ra,12(sp) - 0xc0a18014: move s5,a0 - 0xc0a18018: move s4,zero - 0xc0a1801c: move s3,zero - - # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0) - 0xc0a18020: lui t6,0x8012 - 0xc0a18024: ori t4,t6,0x9e14 - 0xc0a18028: li a1,0 - 0xc0a1802c: jalr t4 - 0xc0a18030: move a0,s5 - 0xc0a18034: bnez v0,0xc0a1ffb8 # incorrect branch offset - 0xc0a18038: move v0,zero - 0xc0a1803c: andi s4,s3,0xf - 0xc0a18040: b 0xc0a18048 - 0xc0a18044: sll s4,s4,0x2 - [...] - - # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0) - 0xc0a1ffa0: lui t6,0x8012 - 0xc0a1ffa4: ori t4,t6,0x9e14 - 0xc0a1ffa8: li a1,0 - 0xc0a1ffac: jalr t4 - 0xc0a1ffb0: move a0,s5 - 0xc0a1ffb4: bnez v0,0xc0a1ffb8 # incorrect branch offset - 0xc0a1ffb8: move v0,zero - 0xc0a1ffbc: andi s4,s3,0xf - 0xc0a1ffc0: b 0xc0a1ffc8 - 0xc0a1ffc4: sll s4,s4,0x2 - - # __BPF_STMT(BPF_LDX | BPF_B | BPF_MSH, 0) - 0xc0a1ffc8: lui t6,0x8012 - 0xc0a1ffcc: ori t4,t6,0x9e14 - 0xc0a1ffd0: li a1,0 - 0xc0a1ffd4: jalr t4 - 0xc0a1ffd8: move a0,s5 - 0xc0a1ffdc: bnez v0,0xc0a3ffb8 # correct branch offset - 0xc0a1ffe0: move v0,zero - 0xc0a1ffe4: andi s4,s3,0xf - 0xc0a1ffe8: b 0xc0a1fff0 - 0xc0a1ffec: sll s4,s4,0x2 - [...] 
-
- # epilogue
- 0xc0a3ffb8: lw s3,0(sp)
- 0xc0a3ffbc: lw s4,4(sp)
- 0xc0a3ffc0: lw s5,8(sp)
- 0xc0a3ffc4: lw ra,12(sp)
- 0xc0a3ffc8: addiu sp,sp,16
- 0xc0a3ffcc: jr ra
- 0xc0a3ffd0: nop
-
-To mitigate this issue, we assert the branch ranges for each emit call
-that could generate an out-of-range branch.
-
-Fixes: 36366e367ee9 ("MIPS: BPF: Restore MIPS32 cBPF JIT")
-Fixes: c6610de353da ("MIPS: net: Add BPF JIT")
-Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
-Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
-Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
-Cc: Paul Burton <paulburton@kernel.org>
-Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
-Link: https://lore.kernel.org/bpf/20210915160437.4080-1-piotras@gmail.com
----
- arch/mips/net/bpf_jit.c | 57 +++++++++++++++++++++++++++++++----------
- 1 file changed, 43 insertions(+), 14 deletions(-)
-
-diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c
-index 0af88622c619..cb6d22439f71 100644
---- a/arch/mips/net/bpf_jit.c
-+++ b/arch/mips/net/bpf_jit.c
-@@ -662,6 +662,11 @@ static void build_epilogue(struct jit_ctx *ctx)
- ((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative : func) : \
- func##_positive)
-
-+static bool is_bad_offset(int b_off)
-+{
-+ return b_off > 0x1ffff || b_off < -0x20000;
-+}
-+
- static int build_body(struct jit_ctx *ctx)
- {
- const struct bpf_prog *prog = ctx->skf;
-@@ -728,7 +733,10 @@ static int build_body(struct jit_ctx *ctx)
- /* Load return register on DS for failures */
- emit_reg_move(r_ret, r_zero, ctx);
- /* Return with error */
-- emit_b(b_imm(prog->len, ctx), ctx);
-+ b_off = b_imm(prog->len, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_b(b_off, ctx);
- emit_nop(ctx);
- break;
- case BPF_LD | BPF_W | BPF_IND:
-@@ -775,8 +783,10 @@ static int build_body(struct jit_ctx *ctx)
- emit_jalr(MIPS_R_RA, r_s0, ctx);
- emit_reg_move(MIPS_R_A0, r_skb, ctx); /* delay slot */
- /* Check the error value */
-- emit_bcond(MIPS_COND_NE, r_ret, 0,
-- b_imm(prog->len, ctx), ctx);
-+ b_off = b_imm(prog->len, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_bcond(MIPS_COND_NE, r_ret, 0, b_off, ctx);
- emit_reg_move(r_ret, r_zero, ctx);
- /* We are good */
- /* X <- P[1:K] & 0xf */
-@@ -855,8 +865,10 @@ static int build_body(struct jit_ctx *ctx)
- /* A /= X */
- ctx->flags |= SEEN_X | SEEN_A;
- /* Check if r_X is zero */
-- emit_bcond(MIPS_COND_EQ, r_X, r_zero,
-- b_imm(prog->len, ctx), ctx);
-+ b_off = b_imm(prog->len, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_bcond(MIPS_COND_EQ, r_X, r_zero, b_off, ctx);
- emit_load_imm(r_ret, 0, ctx); /* delay slot */
- emit_div(r_A, r_X, ctx);
- break;
-@@ -864,8 +876,10 @@ static int build_body(struct jit_ctx *ctx)
- /* A %= X */
- ctx->flags |= SEEN_X | SEEN_A;
- /* Check if r_X is zero */
-- emit_bcond(MIPS_COND_EQ, r_X, r_zero,
-- b_imm(prog->len, ctx), ctx);
-+ b_off = b_imm(prog->len, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_bcond(MIPS_COND_EQ, r_X, r_zero, b_off, ctx);
- emit_load_imm(r_ret, 0, ctx); /* delay slot */
- emit_mod(r_A, r_X, ctx);
- break;
-@@ -926,7 +940,10 @@ static int build_body(struct jit_ctx *ctx)
- break;
- case BPF_JMP | BPF_JA:
- /* pc += K */
-- emit_b(b_imm(i + k + 1, ctx), ctx);
-+ b_off = b_imm(i + k + 1, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_b(b_off, ctx);
- emit_nop(ctx);
- break;
- case BPF_JMP | BPF_JEQ | BPF_K:
-@@ -1056,12 +1073,16 @@ static int build_body(struct jit_ctx *ctx)
- break;
- case BPF_RET | BPF_A:
- ctx->flags |= SEEN_A;
-- if (i != prog->len - 1)
-+ if (i != prog->len - 1) {
- /*
- * If this is not the last instruction
- * then jump to the epilogue
- */
-- emit_b(b_imm(prog->len, ctx), ctx);
-+ b_off = b_imm(prog->len, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_b(b_off, ctx);
-+ }
- emit_reg_move(r_ret, r_A, ctx); /* delay slot */
- break;
- case BPF_RET | BPF_K:
-@@ -1075,7 +1096,10 @@ static int build_body(struct jit_ctx *ctx)
- * If this is not the last instruction
- * then jump to the epilogue
- */
-- emit_b(b_imm(prog->len, ctx), ctx);
-+ b_off = b_imm(prog->len, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_b(b_off, ctx);
- emit_nop(ctx);
- }
- break;
-@@ -1133,8 +1157,10 @@ static int build_body(struct jit_ctx *ctx)
- /* Load *dev pointer */
- emit_load_ptr(r_s0, r_skb, off, ctx);
- /* error (0) in the delay slot */
-- emit_bcond(MIPS_COND_EQ, r_s0, r_zero,
-- b_imm(prog->len, ctx), ctx);
-+ b_off = b_imm(prog->len, ctx);
-+ if (is_bad_offset(b_off))
-+ return -E2BIG;
-+ emit_bcond(MIPS_COND_EQ, r_s0, r_zero, b_off, ctx);
- emit_reg_move(r_ret, r_zero, ctx);
- if (code == (BPF_ANC | SKF_AD_IFINDEX)) {
- BUILD_BUG_ON(sizeof_field(struct net_device, ifindex) != 4);
-@@ -1244,7 +1270,10 @@ void bpf_jit_compile(struct bpf_prog *fp)
-
- /* Generate the actual JIT code */
- build_prologue(&ctx);
-- build_body(&ctx);
-+ if (build_body(&ctx)) {
-+ module_memfree(ctx.target);
-+ goto out;
-+ }
- build_epilogue(&ctx);
-
- /* Update the icache */
---
-2.33.0
-
diff --git
a/debian/patches/bugfix/powerpc/powerpc-boot-fix-missing-crc32poly.h-when-building-with-kernel_xz.patch b/debian/patches/bugfix/powerpc/powerpc-boot-fix-missing-crc32poly.h-when-building-with-kernel_xz.patch index 0b3acfadb..a00a2a485 100644 --- a/debian/patches/bugfix/powerpc/powerpc-boot-fix-missing-crc32poly.h-when-building-with-kernel_xz.patch +++ b/debian/patches/bugfix/powerpc/powerpc-boot-fix-missing-crc32poly.h-when-building-with-kernel_xz.patch @@ -24,14 +24,14 @@ Tested-by: Michal Kubecek <mkubecek@suse.cz> arch/powerpc/boot/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile -index 0fb96c26136f..ba4182fb185d 100644 ---- a/arch/powerpc/boot/Makefile -+++ b/arch/powerpc/boot/Makefile -@@ -63,7 +63,7 @@ ifeq ($(call cc-option-yn, -fstack-protector),y) - BOOTCFLAGS += -fno-stack-protector +Index: linux/arch/powerpc/boot/Makefile +=================================================================== +--- linux.orig/arch/powerpc/boot/Makefile ++++ linux/arch/powerpc/boot/Makefile +@@ -70,7 +70,7 @@ BOOTCFLAGS += -fno-stack-protector endif + BOOTCFLAGS += -include $(srctree)/include/linux/compiler_attributes.h -BOOTCFLAGS += -I$(objtree)/$(obj) -I$(srctree)/$(obj) +BOOTCFLAGS += -I$(objtree)/$(obj) -I$(srctree)/$(obj) -I$(srctree)/include diff --git a/debian/patches/bugfix/sh/sh-boot-do-not-use-hyphen-in-exported-variable-name.patch b/debian/patches/bugfix/sh/sh-boot-do-not-use-hyphen-in-exported-variable-name.patch index 25e5b1044..b48756ac9 100644 --- a/debian/patches/bugfix/sh/sh-boot-do-not-use-hyphen-in-exported-variable-name.patch +++ b/debian/patches/bugfix/sh/sh-boot-do-not-use-hyphen-in-exported-variable-name.patch @@ -23,8 +23,10 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> arch/sh/boot/romimage/Makefile | 4 ++-- 4 files changed, 18 insertions(+), 18 deletions(-) ---- a/arch/sh/Makefile -+++ b/arch/sh/Makefile +Index: linux/arch/sh/Makefile 
+=================================================================== +--- linux.orig/arch/sh/Makefile ++++ linux/arch/sh/Makefile @@ -102,16 +102,16 @@ UTS_MACHINE := sh LDFLAGS_vmlinux += -e _stext @@ -47,8 +49,10 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> head-y := arch/sh/kernel/head_32.o ---- a/arch/sh/boot/Makefile -+++ b/arch/sh/boot/Makefile +Index: linux/arch/sh/boot/Makefile +=================================================================== +--- linux.orig/arch/sh/boot/Makefile ++++ linux/arch/sh/boot/Makefile @@ -19,12 +19,12 @@ CONFIG_ZERO_PAGE_OFFSET ?= 0x00001000 CONFIG_ENTRY_OFFSET ?= 0x00001000 CONFIG_PHYSICAL_START ?= $(CONFIG_MEMORY_START) @@ -69,7 +73,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> targets := zImage vmlinux.srec romImage uImage uImage.srec uImage.gz \ uImage.bz2 uImage.lzma uImage.xz uImage.lzo uImage.bin @@ -106,10 +106,10 @@ OBJCOPYFLAGS_uImage.srec := -I binary -O - $(obj)/uImage.srec: $(obj)/uImage + $(obj)/uImage.srec: $(obj)/uImage FORCE $(call if_changed,objcopy) -$(obj)/uImage: $(obj)/uImage.$(suffix-y) @@ -81,8 +85,10 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> CONFIG_PHYSICAL_START CONFIG_ZERO_PAGE_OFFSET CONFIG_ENTRY_OFFSET \ - KERNEL_MEMORY suffix-y + KERNEL_MEMORY suffix_y ---- a/arch/sh/boot/compressed/Makefile -+++ b/arch/sh/boot/compressed/Makefile +Index: linux/arch/sh/boot/compressed/Makefile +=================================================================== +--- linux.orig/arch/sh/boot/compressed/Makefile ++++ linux/arch/sh/boot/compressed/Makefile @@ -30,7 +30,7 @@ endif ccflags-remove-$(CONFIG_MCOUNT) += -pg @@ -102,8 +108,10 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> -$(obj)/piggy.o: $(obj)/vmlinux.scr $(obj)/vmlinux.bin.$(suffix-y) FORCE +$(obj)/piggy.o: $(obj)/vmlinux.scr $(obj)/vmlinux.bin.$(suffix_y) FORCE $(call if_changed,ld) ---- a/arch/sh/boot/romimage/Makefile -+++ b/arch/sh/boot/romimage/Makefile +Index: linux/arch/sh/boot/romimage/Makefile 
+=================================================================== +--- linux.orig/arch/sh/boot/romimage/Makefile ++++ linux/arch/sh/boot/romimage/Makefile @@ -13,7 +13,7 @@ mmcif-obj-$(CONFIG_CPU_SUBTYPE_SH7724) : load-$(CONFIG_ROMIMAGE_MMCIF) := $(mmcif-load-y) obj-$(CONFIG_ROMIMAGE_MMCIF) := $(mmcif-obj-y) diff --git a/debian/patches/bugfix/x86/Revert-drm-i915-Implement-Wa_1508744258.patch b/debian/patches/bugfix/x86/Revert-drm-i915-Implement-Wa_1508744258.patch new file mode 100644 index 000000000..9280eff97 --- /dev/null +++ b/debian/patches/bugfix/x86/Revert-drm-i915-Implement-Wa_1508744258.patch @@ -0,0 +1,53 @@ +From: =?UTF-8?q?Jos=C3=A9=20Roberto=20de=20Souza?= <jose.souza@intel.com> +Date: Fri, 19 Nov 2021 06:09:30 -0800 +Subject: Revert "drm/i915: Implement Wa_1508744258" +MIME-Version: 1.0 +Content-Type: text/plain; charset=UTF-8 +Content-Transfer-Encoding: 8bit +Origin: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit?id=894b21da042f94f23f1bcbe7362b54b1657aa345 +Bug-Debian: https://bugs.debian.org/1001128 + +[ Upstream commit 72641d8d60401a5f1e1a0431ceaf928680d34418 ] + +This workarounds are causing hangs, because I missed the fact that it +needs to be enabled for all cases and disabled when doing a resolve +pass. + +So KMD only needs to whitelist it and UMD will be the one setting it +on per case. + +This reverts commit 28ec02c9cbebf3feeaf21a59df9dfbc02bda3362. 
+ +Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4145 +Signed-off-by: José Roberto de Souza <jose.souza@intel.com> +Fixes: 28ec02c9cbeb ("drm/i915: Implement Wa_1508744258") +Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> +Link: https://patchwork.freedesktop.org/patch/msgid/20211119140931.32791-1-jose.souza@intel.com +(cherry picked from commit f3799ff16fcfacd44aee55db162830df461b631f) +Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> +Signed-off-by: Sasha Levin <sashal@kernel.org> +--- + drivers/gpu/drm/i915/gt/intel_workarounds.c | 7 ------- + 1 file changed, 7 deletions(-) + +diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c +index aae609d7d85d..6b5ab19a2ada 100644 +--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c ++++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c +@@ -621,13 +621,6 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine, + FF_MODE2_GS_TIMER_MASK, + FF_MODE2_GS_TIMER_224, + 0, false); +- +- /* +- * Wa_14012131227:dg1 +- * Wa_1508744258:tgl,rkl,dg1,adl-s,adl-p +- */ +- wa_masked_en(wal, GEN7_COMMON_SLICE_CHICKEN1, +- GEN9_RHWO_OPTIMIZATION_DISABLE); + } + + static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine, +-- +2.34.1 + diff --git a/debian/patches/debian/dfsg/drivers-net-appletalk-cops.patch b/debian/patches/debian/dfsg/drivers-net-appletalk-cops.patch index 44258800a..9a7884605 100644 --- a/debian/patches/debian/dfsg/drivers-net-appletalk-cops.patch +++ b/debian/patches/debian/dfsg/drivers-net-appletalk-cops.patch @@ -8,15 +8,19 @@ Forwarded: not-needed drivers/net/appletalk/Makefile | 1 - 2 files changed, 27 deletions(-) ---- a/drivers/net/appletalk/Kconfig -+++ b/drivers/net/appletalk/Kconfig -@@ -50,33 +50,6 @@ config LTPC +Index: linux/drivers/net/appletalk/Kconfig +=================================================================== +--- linux.orig/drivers/net/appletalk/Kconfig ++++ linux/drivers/net/appletalk/Kconfig +@@ -50,35 
+50,6 @@ config LTPC This driver is experimental, which means that it may not work. See the file <file:Documentation/networking/device_drivers/appletalk/ltpc.rst>. -config COPS - tristate "COPS LocalTalk PC support" -- depends on DEV_APPLETALK && (ISA || EISA) +- depends on DEV_APPLETALK && ISA +- depends on NETDEVICES +- select NETDEV_LEGACY_INIT - help - This allows you to use COPS AppleTalk cards to connect to LocalTalk - networks. You also need version 1.3.3 or later of the netatalk @@ -44,8 +48,10 @@ Forwarded: not-needed config IPDDP tristate "Appletalk-IP driver support" depends on DEV_APPLETALK && ATALK ---- a/drivers/net/appletalk/Makefile -+++ b/drivers/net/appletalk/Makefile +Index: linux/drivers/net/appletalk/Makefile +=================================================================== +--- linux.orig/drivers/net/appletalk/Makefile ++++ linux/drivers/net/appletalk/Makefile @@ -4,5 +4,4 @@ # diff --git a/debian/patches/debian/export-symbols-needed-by-android-drivers.patch b/debian/patches/debian/export-symbols-needed-by-android-drivers.patch index 3bc2c83ae..1532d10bc 100644 --- a/debian/patches/debian/export-symbols-needed-by-android-drivers.patch +++ b/debian/patches/debian/export-symbols-needed-by-android-drivers.patch @@ -22,7 +22,7 @@ Export the currently un-exported symbols they depend on. --- a/kernel/fork.c +++ b/kernel/fork.c -@@ -1134,6 +1134,7 @@ +@@ -1153,6 +1153,7 @@ void mmput_async(struct mm_struct *mm) schedule_work(&mm->async_put_work); } } @@ -32,7 +32,7 @@ Export the currently un-exported symbols they depend on. /** --- a/kernel/sched/core.c +++ b/kernel/sched/core.c -@@ -5774,6 +5774,7 @@ +@@ -6931,6 +6931,7 @@ int can_nice(const struct task_struct *p return (nice_rlim <= task_rlimit(p, RLIMIT_NICE) || capable(CAP_SYS_NICE)); } @@ -42,7 +42,7 @@ Export the currently un-exported symbols they depend on. 
--- a/kernel/task_work.c +++ b/kernel/task_work.c -@@ -60,6 +60,7 @@ +@@ -60,6 +60,7 @@ int task_work_add(struct task_struct *ta return 0; } @@ -52,7 +52,7 @@ Export the currently un-exported symbols they depend on. * task_work_cancel_match - cancel a pending work added by task_work_add() --- a/mm/memory.c +++ b/mm/memory.c -@@ -1560,6 +1560,7 @@ +@@ -1655,6 +1655,7 @@ void zap_page_range(struct vm_area_struc mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); } @@ -62,7 +62,7 @@ Export the currently un-exported symbols they depend on. * zap_page_range_single - remove user pages in a given range --- a/mm/shmem.c +++ b/mm/shmem.c -@@ -4231,6 +4231,7 @@ +@@ -4162,6 +4162,7 @@ int shmem_zero_setup(struct vm_area_stru return 0; } @@ -72,28 +72,28 @@ Export the currently un-exported symbols they depend on. * shmem_read_mapping_page_gfp - read into page cache, using specified page allocation flags. --- a/security/security.c +++ b/security/security.c -@@ -750,24 +750,28 @@ +@@ -751,24 +751,28 @@ int security_binder_set_context_mgr(cons { return call_int_hook(binder_set_context_mgr, 0, mgr); } +EXPORT_SYMBOL_GPL(security_binder_set_context_mgr); - int security_binder_transaction(struct task_struct *from, - struct task_struct *to) + int security_binder_transaction(const struct cred *from, + const struct cred *to) { return call_int_hook(binder_transaction, 0, from, to); } +EXPORT_SYMBOL_GPL(security_binder_transaction); - int security_binder_transfer_binder(struct task_struct *from, - struct task_struct *to) + int security_binder_transfer_binder(const struct cred *from, + const struct cred *to) { return call_int_hook(binder_transfer_binder, 0, from, to); } +EXPORT_SYMBOL_GPL(security_binder_transfer_binder); - int security_binder_transfer_file(struct task_struct *from, - struct task_struct *to, struct file *file) + int security_binder_transfer_file(const struct cred *from, + const struct cred *to, struct file *file) { return call_int_hook(binder_transfer_file, 
0, from, to, file); } @@ -103,7 +103,7 @@ Export the currently un-exported symbols they depend on. { --- a/fs/file.c +++ b/fs/file.c -@@ -788,6 +788,7 @@ +@@ -804,6 +804,7 @@ int close_fd_get_file(unsigned int fd, s return ret; } diff --git a/debian/patches/debian/kbuild-look-for-module.lds-under-arch-directory-too.patch b/debian/patches/debian/kbuild-look-for-module.lds-under-arch-directory-too.patch index 13a2af254..7a87726ce 100644 --- a/debian/patches/debian/kbuild-look-for-module.lds-under-arch-directory-too.patch +++ b/debian/patches/debian/kbuild-look-for-module.lds-under-arch-directory-too.patch @@ -22,17 +22,17 @@ Therefore, we move module.lds under the arch build directory in rules.real and change Makefile.modfinal to look for it in both places. --- ---- a/scripts/Makefile.modfinal -+++ b/scripts/Makefile.modfinal -@@ -29,6 +29,7 @@ +Index: linux/scripts/Makefile.modfinal +=================================================================== +--- linux.orig/scripts/Makefile.modfinal ++++ linux/scripts/Makefile.modfinal +@@ -29,12 +29,13 @@ quiet_cmd_cc_o_c = CC [M] $@ $(call if_changed_dep,cc_o_c) ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink) +ARCH_MODULE_LDS := $(word 1,$(wildcard scripts/module.lds arch/$(SRCARCH)/module.lds)) - ifdef CONFIG_LTO_CLANG - # With CONFIG_LTO_CLANG, reuse the object file we compiled for modpost to -@@ -53,7 +54,7 @@ + quiet_cmd_ld_ko_o = LD [M] $@ cmd_ld_ko_o += \ $(LD) -r $(KBUILD_LDFLAGS) \ $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \ @@ -41,12 +41,12 @@ rules.real and change Makefile.modfinal to look for it in both places. 
$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true) quiet_cmd_btf_ko = BTF [M] $@ -@@ -74,7 +75,7 @@ +@@ -55,7 +56,7 @@ if_changed_except = $(if $(call newer_pr # Re-generate module BTFs if either module's .ko or vmlinux changed --$(modules): %.ko: %$(prelink-ext).o %.mod.o scripts/module.lds $(if $(KBUILD_BUILTIN),vmlinux) FORCE -+$(modules): %.ko: %$(prelink-ext).o %.mod.o $(ARCH_MODULE_LDS) $(if $(KBUILD_BUILTIN),vmlinux) FORCE +-$(modules): %.ko: %$(mod-prelink-ext).o %.mod.o scripts/module.lds $(if $(KBUILD_BUILTIN),vmlinux) FORCE ++$(modules): %.ko: %$(mod-prelink-ext).o %.mod.o $(ARCH_MODULE_LDS) $(if $(KBUILD_BUILTIN),vmlinux) FORCE +$(call if_changed_except,ld_ko_o,vmlinux) ifdef CONFIG_DEBUG_INFO_BTF_MODULES +$(if $(newer-prereqs),$(call cmd,btf_ko)) diff --git a/debian/patches/debian/tools-perf-install.patch b/debian/patches/debian/tools-perf-install.patch index b25bd2575..062348e7a 100644 --- a/debian/patches/debian/tools-perf-install.patch +++ b/debian/patches/debian/tools-perf-install.patch @@ -8,9 +8,11 @@ Forwarded: no tools/perf/Makefile.perf | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) ---- a/tools/perf/Makefile.perf -+++ b/tools/perf/Makefile.perf -@@ -952,8 +952,8 @@ endif +Index: linux/tools/perf/Makefile.perf +=================================================================== +--- linux.orig/tools/perf/Makefile.perf ++++ linux/tools/perf/Makefile.perf +@@ -975,8 +975,8 @@ endif ifndef NO_LIBPERL $(call QUIET_INSTALL, perl-scripts) \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl/Perf-Trace-Util/lib/Perf/Trace'; \ @@ -21,8 +23,8 @@ Forwarded: no $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl/bin'; \ $(INSTALL) scripts/perl/bin/* -t '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/scripts/perl/bin' endif -@@ -967,22 +967,22 @@ ifndef NO_LIBPYTHON - endif +@@ -993,22 +993,22 @@ endif + $(INSTALL) $(DLFILTERS) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/dlfilters'; $(call 
QUIET_INSTALL, perf_completion-script) \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(sysconfdir_SQ)/bash_completion.d'; \ - $(INSTALL) perf-completion.sh '$(DESTDIR_SQ)$(sysconfdir_SQ)/bash_completion.d/perf_$(VERSION)' diff --git a/debian/patches/debian/tools-perf-version.patch b/debian/patches/debian/tools-perf-version.patch index 2f1ba2a8a..1cb005833 100644 --- a/debian/patches/debian/tools-perf-version.patch +++ b/debian/patches/debian/tools-perf-version.patch @@ -17,7 +17,7 @@ Index: linux/tools/perf/Documentation/Makefile =================================================================== --- linux.orig/tools/perf/Documentation/Makefile +++ linux/tools/perf/Documentation/Makefile -@@ -195,14 +195,16 @@ ifdef missing_tools +@@ -190,14 +190,16 @@ ifdef missing_tools $(error "You need to install $(missing_tools) for man pages") endif @@ -44,18 +44,18 @@ Index: linux/tools/perf/Makefile.perf =================================================================== --- linux.orig/tools/perf/Makefile.perf +++ linux/tools/perf/Makefile.perf -@@ -922,25 +922,25 @@ endif +@@ -932,25 +932,25 @@ endif install-tools: all install-gtk $(call QUIET_INSTALL, binaries) \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(bindir_SQ)'; \ - $(INSTALL) $(OUTPUT)perf '$(DESTDIR_SQ)$(bindir_SQ)'; \ - $(LN) '$(DESTDIR_SQ)$(bindir_SQ)/perf' '$(DESTDIR_SQ)$(bindir_SQ)/trace'; \ - $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(includedir_SQ)/perf'; \ -- $(INSTALL) util/perf_dlfilter.h -t '$(DESTDIR_SQ)$(includedir_SQ)/perf' +- $(INSTALL) -m 644 include/perf/perf_dlfilter.h -t '$(DESTDIR_SQ)$(includedir_SQ)/perf' + $(INSTALL) $(OUTPUT)perf '$(DESTDIR_SQ)$(bindir_SQ)/perf_$(VERSION)'; \ -+ $(LN) '$(DESTDIR_SQ)$(bindir_SQ)/perf' '$(DESTDIR_SQ)$(bindir_SQ)/trace_$(VERSION)'; \ ++ $(LN) '$(DESTDIR_SQ)$(bindir_SQ)/perf_$(VERSION)' '$(DESTDIR_SQ)$(bindir_SQ)/trace_$(VERSION)'; \ + $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perf_include_instdir_SQ)'; \ -+ $(INSTALL) util/perf_dlfilter.h -t '$(DESTDIR_SQ)$(perf_include_instdir_SQ)' ++ 
$(INSTALL) -m 644 include/perf/perf_dlfilter.h -t '$(DESTDIR_SQ)$(perf_include_instdir_SQ)' + $(call QUIET_INSTALL, libexec) \ + $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perfexec_instdir_SQ)' ifndef NO_PERF_READ_VDSO32 @@ -78,8 +78,8 @@ Index: linux/tools/perf/Makefile.perf ifndef NO_LIBBPF $(call QUIET_INSTALL, bpf-headers) \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perf_include_instdir_SQ)/bpf'; \ -@@ -980,7 +980,7 @@ ifndef NO_LIBPYTHON - endif +@@ -993,7 +993,7 @@ endif + $(INSTALL) $(DLFILTERS) '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/dlfilters'; $(call QUIET_INSTALL, perf_completion-script) \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(sysconfdir_SQ)/bash_completion.d'; \ - $(INSTALL) perf-completion.sh '$(DESTDIR_SQ)$(sysconfdir_SQ)/bash_completion.d/perf' @@ -87,7 +87,7 @@ Index: linux/tools/perf/Makefile.perf $(call QUIET_INSTALL, perf-tip) \ $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(tip_instdir_SQ)'; \ $(INSTALL) Documentation/tips.txt -t '$(DESTDIR_SQ)$(tip_instdir_SQ)' -@@ -1006,7 +1006,7 @@ install-python_ext: +@@ -1019,7 +1019,7 @@ install-python_ext: # 'make install-doc' should call 'make -C Documentation install' $(INSTALL_DOC_TARGETS): @@ -100,7 +100,7 @@ Index: linux/tools/perf/util/Build =================================================================== --- linux.orig/tools/perf/util/Build +++ linux/tools/perf/util/Build -@@ -278,6 +278,7 @@ CFLAGS_hweight.o += -Wno-unused-pa +@@ -279,6 +279,7 @@ CFLAGS_hweight.o += -Wno-unused-pa CFLAGS_parse-events.o += -Wno-redundant-decls CFLAGS_expr.o += -Wno-redundant-decls CFLAGS_header.o += -include $(OUTPUT)PERF-VERSION-FILE diff --git a/debian/patches/features/all/db-mok-keyring/0001-MODSIGN-do-not-load-mok-when-secure-boot-disabled.patch b/debian/patches/features/all/db-mok-keyring/0001-MODSIGN-do-not-load-mok-when-secure-boot-disabled.patch index 3233fbf10..1eebc75e7 100644 --- a/debian/patches/features/all/db-mok-keyring/0001-MODSIGN-do-not-load-mok-when-secure-boot-disabled.patch +++ 
b/debian/patches/features/all/db-mok-keyring/0001-MODSIGN-do-not-load-mok-when-secure-boot-disabled.patch @@ -20,20 +20,21 @@ Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com> [Salvatore Bonaccorso: Forward-ported to 5.10: Refresh for changes in 38a1f03aa240 ("integrity: Move import of MokListRT certs to a separate routine")] +[bwh: Adjust context after 5.13] --- security/integrity/platform_certs/load_uefi.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) --- a/security/integrity/platform_certs/load_uefi.c +++ b/security/integrity/platform_certs/load_uefi.c -@@ -191,6 +191,10 @@ - kfree(mokx); +@@ -176,6 +176,10 @@ static int __init load_uefi_certs(void) + kfree(dbx); } + /* the MOK can not be trusted when secure boot is disabled */ + if (!efi_enabled(EFI_SECURE_BOOT)) + return 0; + - /* Load the MokListRT certs */ - rc = load_moklist_certs(); - + mokx = get_cert_list(L"MokListXRT", &mok_var, &mokxsize, &status); + if (!mokx) { + if (status == EFI_NOT_FOUND) diff --git a/debian/patches/features/all/db-mok-keyring/0002-MODSIGN-load-blacklist-from-MOKx.patch b/debian/patches/features/all/db-mok-keyring/0002-MODSIGN-load-blacklist-from-MOKx.patch deleted file mode 100644 index 328a8dff6..000000000 --- a/debian/patches/features/all/db-mok-keyring/0002-MODSIGN-load-blacklist-from-MOKx.patch +++ /dev/null @@ -1,135 +0,0 @@ -From: Ben Hutchings <benh@debian.org> -Date: Sun, 15 Nov 2020 01:01:03 +0000 -Subject: MODSIGN: load blacklist from MOKx - -Loosely based on a patch by "Lee, Chun-Yi" <joeyli.kernel@gmail.com> -at <https://lore.kernel.org/patchwork/patch/933177/> which was later -rebased by Luca Boccassi. - -This patch adds the logic to load the blacklisted hash and -certificates from MOKx which is maintained by shim bootloader. - -Since MOK list loading became more complicated in 5.10 and was moved -to load_moklist_certs(), add parameters to that and call it once for -each of MokListRT and MokListXRT. 
- -Signed-off-by: Ben Hutchings <benh@debian.org> ---- - security/integrity/platform_certs/load_uefi.c | 47 +++++++++++++++++--------- - 1 file changed, 31 insertions(+), 16 deletions(-) - ---- a/security/integrity/platform_certs/load_uefi.c -+++ b/security/integrity/platform_certs/load_uefi.c -@@ -76,49 +76,59 @@ - * - * Return: Status - */ --static int __init load_moklist_certs(void) -+static int __init -+load_moklist_certs(const char *list_name, efi_char16_t *list_name_w, -+ efi_element_handler_t (*get_handler)(const efi_guid_t *)) - { - struct efi_mokvar_table_entry *mokvar_entry; - efi_guid_t mok_var = EFI_SHIM_LOCK_GUID; - void *mok; - unsigned long moksize; - efi_status_t status; -+ char mokvar_list_desc[40]; -+ char efivar_list_desc[20]; - int rc; - -+ snprintf(mokvar_list_desc, sizeof(mokvar_list_desc), -+ "UEFI:%s (MOKvar table)", list_name); -+ snprintf(efivar_list_desc, sizeof(efivar_list_desc), -+ "UEFI:%s", list_name); -+ - /* First try to load certs from the EFI MOKvar config table. - * It's not an error if the MOKvar config table doesn't exist -- * or the MokListRT entry is not found in it. -+ * or the MokList(X)RT entry is not found in it. - */ -- mokvar_entry = efi_mokvar_entry_find("MokListRT"); -+ mokvar_entry = efi_mokvar_entry_find(list_name); - if (mokvar_entry) { -- rc = parse_efi_signature_list("UEFI:MokListRT (MOKvar table)", -+ rc = parse_efi_signature_list(mokvar_list_desc, - mokvar_entry->data, - mokvar_entry->data_size, -- get_handler_for_db); -+ get_handler); - /* All done if that worked. */ - if (!rc) - return rc; - -- pr_err("Couldn't parse MokListRT signatures from EFI MOKvar config table: %d\n", -- rc); -+ pr_err("Couldn't parse %s signatures from EFI MOKvar config table: %d\n", -+ list_name, rc); - } - -- /* Get MokListRT. It might not exist, so it isn't an error -+ /* Get MokList(X)RT. It might not exist, so it isn't an error - * if we can't get it. 
- */ -- mok = get_cert_list(L"MokListRT", &mok_var, &moksize, &status); -+ mok = get_cert_list(list_name_w, &mok_var, &moksize, &status); - if (mok) { -- rc = parse_efi_signature_list("UEFI:MokListRT", -- mok, moksize, get_handler_for_db); -+ rc = parse_efi_signature_list(efivar_list_desc, -+ mok, moksize, get_handler); - kfree(mok); - if (rc) -- pr_err("Couldn't parse MokListRT signatures: %d\n", rc); -+ pr_err("Couldn't parse %s signatures: %d\n", -+ list_name, rc); - return rc; - } - if (status == EFI_NOT_FOUND) -- pr_debug("MokListRT variable wasn't found\n"); -+ pr_debug("%s variable wasn't found\n", list_name); - else -- pr_info("Couldn't get UEFI MokListRT\n"); -+ pr_info("Couldn't get UEFI %s\n", list_name); - return 0; - } - -@@ -176,27 +186,17 @@ - kfree(dbx); - } - -- mokx = get_cert_list(L"MokListXRT", &mok_var, &mokxsize, &status); -- if (!mokx) { -- if (status == EFI_NOT_FOUND) -- pr_debug("mokx variable wasn't found\n"); -- else -- pr_info("Couldn't get mokx list\n"); -- } else { -- rc = parse_efi_signature_list("UEFI:MokListXRT", -- mokx, mokxsize, -- get_handler_for_dbx); -- if (rc) -- pr_err("Couldn't parse mokx signatures %d\n", rc); -- kfree(mokx); -- } -- -- /* the MOK can not be trusted when secure boot is disabled */ -- if (!efi_enabled(EFI_SECURE_BOOT)) -- return 0; -- -- /* Load the MokListRT certs */ -- rc = load_moklist_certs(); -+ /* the MOK and MOKx can not be trusted when secure boot is disabled */ -+ if (!efi_enabled(EFI_SECURE_BOOT)) -+ return 0; -+ -+ /* Load the MokListRT certs */ -+ rc = load_moklist_certs("MokListRT", L"MokListRT", -+ get_handler_for_db); -+ if (rc) -+ return rc; -+ rc = load_moklist_certs("MokListXRT", L"MokListXRT", -+ get_handler_for_dbx); - - return rc; - } diff --git a/debian/patches/features/all/db-mok-keyring/0004-MODSIGN-check-the-attributes-of-db-and-mok.patch b/debian/patches/features/all/db-mok-keyring/0004-MODSIGN-check-the-attributes-of-db-and-mok.patch index 00a408656..e3e71d05e 100644 --- 
a/debian/patches/features/all/db-mok-keyring/0004-MODSIGN-check-the-attributes-of-db-and-mok.patch +++ b/debian/patches/features/all/db-mok-keyring/0004-MODSIGN-check-the-attributes-of-db-and-mok.patch @@ -26,6 +26,7 @@ Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com> - Adjust filename, context] [bwh: Forward-ported to 5.10: MokListRT and MokListXRT are now both loaded through a single code path.] +[bwh: Forward-ported to 5.13: No they aren't] --- security/integrity/platform_certs/load_uefi.c | 27 ++++++++++++++----- 1 file changed, 21 insertions(+), 6 deletions(-) @@ -71,17 +72,17 @@ Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com> *size = lsize; return db; } -@@ -115,7 +126,8 @@ load_moklist_certs(const char *list_name - /* Get MokList(X)RT. It might not exist, so it isn't an error +@@ -106,7 +117,8 @@ static int __init load_moklist_certs(voi + /* Get MokListRT. It might not exist, so it isn't an error * if we can't get it. */ -- mok = get_cert_list(list_name_w, &mok_var, &moksize, &status); -+ mok = get_cert_list(list_name_w, &mok_var, &moksize, &status, +- mok = get_cert_list(L"MokListRT", &mok_var, &moksize, &status); ++ mok = get_cert_list(L"MokListRT", &mok_var, &moksize, &status, + 0, EFI_VARIABLE_NON_VOLATILE); if (mok) { - rc = parse_efi_signature_list(efivar_list_desc, - mok, moksize, get_handler); -@@ -154,7 +166,8 @@ static int __init load_uefi_certs(void) + rc = parse_efi_signature_list("UEFI:MokListRT", + mok, moksize, get_handler_for_db); +@@ -145,7 +157,8 @@ static int __init load_uefi_certs(void) * if we can't get them. 
*/ if (!uefi_check_ignore_db()) { @@ -91,7 +92,7 @@ Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com> if (!db) { if (status == EFI_NOT_FOUND) pr_debug("MODSIGN: db variable wasn't found\n"); -@@ -170,7 +183,8 @@ static int __init load_uefi_certs(void) +@@ -161,7 +174,8 @@ static int __init load_uefi_certs(void) } } @@ -101,3 +102,13 @@ Signed-off-by: "Lee, Chun-Yi" <jlee@suse.com> if (!dbx) { if (status == EFI_NOT_FOUND) pr_debug("dbx variable wasn't found\n"); +@@ -180,7 +194,8 @@ static int __init load_uefi_certs(void) + if (!efi_enabled(EFI_SECURE_BOOT)) + return 0; + +- mokx = get_cert_list(L"MokListXRT", &mok_var, &mokxsize, &status); ++ mokx = get_cert_list(L"MokListXRT", &mok_var, &mokxsize, &status, ++ 0, EFI_VARIABLE_NON_VOLATILE); + if (!mokx) { + if (status == EFI_NOT_FOUND) + pr_debug("mokx variable wasn't found\n"); diff --git a/debian/patches/features/arm64/arm64-dts-rockchip-disable-USB-type-c-DisplayPort.patch b/debian/patches/features/arm64/arm64-dts-rockchip-disable-USB-type-c-DisplayPort.patch deleted file mode 100644 index 51ab544d5..000000000 --- a/debian/patches/features/arm64/arm64-dts-rockchip-disable-USB-type-c-DisplayPort.patch +++ /dev/null @@ -1,40 +0,0 @@ -From 3a75704e99a118f2d8a4d70f07781558bde85770 Mon Sep 17 00:00:00 2001 -From: Jian-Hong Pan <jhp@endlessos.org> -Date: Thu, 24 Sep 2020 14:30:43 +0800 -Subject: [PATCH] arm64: dts: rockchip: disable USB type-c DisplayPort - -The cdn-dp sub driver probes the device failed on PINEBOOK Pro. - -kernel: cdn-dp fec00000.dp: [drm:cdn_dp_probe [rockchipdrm]] *ERROR* missing extcon or phy -kernel: cdn-dp: probe of fec00000.dp failed with error -22 - -Then, the device halts all of the DRM related device jobs. For example, -the operations: vop_component_ops, vop_component_ops and -rockchip_dp_component_ops cannot be bound to corresponding devices. So, -Xorg cannot find the correct DRM device. - -The USB type-C DisplayPort does not work for now. 
So, disable the -DisplayPort node until the type-C phy work has been done. - -Link: https://patchwork.kernel.org/patch/11794141/#23639877 -Signed-off-by: Jian-Hong Pan <jhp@endlessos.org> ---- - arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts b/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts -index 219b7507a10f..45769764425d 100644 ---- a/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts -+++ b/arch/arm64/boot/dts/rockchip/rk3399-pinebook-pro.dts -@@ -380,7 +380,7 @@ - }; - - &cdn_dp { -- status = "okay"; -+ status = "disabled"; - }; - - &cpu_b0 { --- -2.30.2 - diff --git a/debian/patches/features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch b/debian/patches/features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch index 692854d51..01602a76e 100644 --- a/debian/patches/features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch +++ b/debian/patches/features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch @@ -13,18 +13,17 @@ corresponding to "on", "off", and "on,intgpu_off". Signed-off-by: Ben Hutchings <ben@decadent.org.uk> --- ---- a/drivers/iommu/intel/Kconfig -+++ b/drivers/iommu/intel/Kconfig -@@ -46,14 +46,28 @@ +Index: linux/drivers/iommu/intel/Kconfig +=================================================================== +--- linux.orig/drivers/iommu/intel/Kconfig ++++ linux/drivers/iommu/intel/Kconfig +@@ -54,14 +54,25 @@ config INTEL_IOMMU_SVM to access DMA resources through process address space by means of a Process Address Space ID (PASID). 
-config INTEL_IOMMU_DEFAULT_ON -- def_bool y -- prompt "Enable Intel DMA Remapping Devices by default" -- depends on INTEL_IOMMU -+if INTEL_IOMMU -+ +- bool "Enable Intel DMA Remapping Devices by default" +- default y +choice + prompt "Default state of Intel DMA Remapping Devices" + default INTEL_IOMMU_DEFAULT_ON @@ -44,38 +43,34 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> + +config INTEL_IOMMU_DEFAULT_OFF + bool "Disable" -+ -+endchoice -+ -+endif ++endchoice ++ config INTEL_IOMMU_BROKEN_GFX_WA bool "Workaround broken graphics drivers (going away soon)" ---- a/drivers/iommu/intel/iommu.c -+++ b/drivers/iommu/intel/iommu.c -@@ -347,11 +347,7 @@ + depends on BROKEN && X86 +Index: linux/drivers/iommu/intel/iommu.c +=================================================================== +--- linux.orig/drivers/iommu/intel/iommu.c ++++ linux/drivers/iommu/intel/iommu.c +@@ -331,14 +331,14 @@ static int intel_iommu_attach_device(str static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova); --#ifdef CONFIG_INTEL_IOMMU_DEFAULT_ON --int dmar_disabled = 0; --#else --int dmar_disabled = 1; --#endif /* CONFIG_INTEL_IOMMU_DEFAULT_ON */ +-int dmar_disabled = !IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON); +int dmar_disabled = IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_OFF); + int intel_iommu_sm = IS_ENABLED(CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON); - #ifdef CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON - int intel_iommu_sm = 1; -@@ -363,7 +359,7 @@ + int intel_iommu_enabled = 0; EXPORT_SYMBOL_GPL(intel_iommu_enabled); static int dmar_map_gfx = 1; -static int dmar_map_intgpu = 1; +static int dmar_map_intgpu = IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON); - static int intel_iommu_strict; static int intel_iommu_superpage = 1; static int iommu_identity_mapping; -@@ -446,6 +442,7 @@ + static int iommu_skip_te_disable; +@@ -420,6 +420,7 @@ static int __init intel_iommu_setup(char while (*str) { if (!strncmp(str, "on", 2)) { dmar_disabled = 0; diff --git 
a/debian/patches/features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch b/debian/patches/features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch index 5fa8cfd80..2b4cc3a3c 100644 --- a/debian/patches/features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch +++ b/debian/patches/features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch @@ -20,20 +20,24 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> drivers/iommu/intel/iommu.c | 14 ++++++++++++++ 2 files changed, 16 insertions(+) ---- a/Documentation/admin-guide/kernel-parameters.txt -+++ b/Documentation/admin-guide/kernel-parameters.txt -@@ -1897,6 +1897,8 @@ +Index: linux/Documentation/admin-guide/kernel-parameters.txt +=================================================================== +--- linux.orig/Documentation/admin-guide/kernel-parameters.txt ++++ linux/Documentation/admin-guide/kernel-parameters.txt +@@ -1959,6 +1959,8 @@ bypassed by not enabling DMAR with this option. In this case, gfx device will use physical address for DMA. + intgpu_off [Default Off] + Bypass the DMAR unit for an integrated GPU only. strict [Default Off] - With this option on every unmap_single operation will - result in a hardware IOTLB flush operation as opposed ---- a/drivers/iommu/intel/iommu.c -+++ b/drivers/iommu/intel/iommu.c -@@ -53,6 +53,9 @@ + Deprecated, equivalent to iommu.strict=1. 
+ sp_off [Default Off] +Index: linux/drivers/iommu/intel/iommu.c +=================================================================== +--- linux.orig/drivers/iommu/intel/iommu.c ++++ linux/drivers/iommu/intel/iommu.c +@@ -55,6 +55,9 @@ #define CONTEXT_SIZE VTD_PAGE_SIZE #define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY) @@ -43,15 +47,14 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> #define IS_USB_DEVICE(pdev) ((pdev->class >> 8) == PCI_CLASS_SERIAL_USB) #define IS_ISA_DEVICE(pdev) ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) #define IS_AZALIA(pdev) ((pdev)->vendor == 0x8086 && (pdev)->device == 0x3a3e) -@@ -360,6 +363,7 @@ +@@ -335,12 +338,14 @@ int intel_iommu_enabled = 0; EXPORT_SYMBOL_GPL(intel_iommu_enabled); static int dmar_map_gfx = 1; +static int dmar_map_intgpu = 1; - static int intel_iommu_strict; static int intel_iommu_superpage = 1; static int iommu_identity_mapping; -@@ -367,6 +371,7 @@ + static int iommu_skip_te_disable; #define IDENTMAP_GFX 2 #define IDENTMAP_AZALIA 4 @@ -59,7 +62,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> int intel_iommu_gfx_mapped; EXPORT_SYMBOL_GPL(intel_iommu_gfx_mapped); -@@ -449,6 +454,9 @@ +@@ -423,6 +428,9 @@ static int __init intel_iommu_setup(char } else if (!strncmp(str, "igfx_off", 8)) { dmar_map_gfx = 0; pr_info("Disable GFX device mapping\n"); @@ -69,7 +72,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> } else if (!strncmp(str, "forcedac", 8)) { pr_warn("intel_iommu=forcedac deprecated; use iommu.forcedac instead\n"); iommu_dma_forcedac = true; -@@ -2897,6 +2905,9 @@ +@@ -2877,6 +2885,9 @@ static int device_def_domain_type(struct if ((iommu_identity_mapping & IDENTMAP_GFX) && IS_GFX_DEVICE(pdev)) return IOMMU_DOMAIN_IDENTITY; @@ -79,7 +82,7 @@ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> } return 0; -@@ -3334,6 +3345,9 @@ +@@ -3313,6 +3324,9 @@ static int __init init_dmars(void) if (!dmar_map_gfx) iommu_identity_mapping |= IDENTMAP_GFX; diff --git 
a/debian/patches/series b/debian/patches/series index dd272e2a3..6f5874334 100644 --- a/debian/patches/series +++ b/debian/patches/series @@ -76,22 +76,19 @@ bugfix/arm/arm-mm-export-__sync_icache_dcache-for-xen-privcmd.patch bugfix/powerpc/powerpc-boot-fix-missing-crc32poly.h-when-building-with-kernel_xz.patch bugfix/arm64/arm64-acpi-Add-fixup-for-HPE-m400-quirks.patch bugfix/x86/x86-32-disable-3dnow-in-generic-config.patch -bugfix/mipsel/bpf-mips-Validate-conditional-branch-offsets.patch -bugfix/arm/ARM-dts-sun7i-A20-olinuxino-lime2-Fix-ethernet-phy-m.patch +bugfix/x86/Revert-drm-i915-Implement-Wa_1508744258.patch # Arch features features/arm64/arm64-dts-rockchip-Add-support-for-two-PWM-fans-on-h.patch features/arm64/arm64-dts-rockchip-Add-support-for-PCIe-on-helios64.patch -features/arm64/arm64-dts-rockchip-disable-USB-type-c-DisplayPort.patch features/x86/x86-memtest-WARN-if-bad-RAM-found.patch features/x86/x86-make-x32-syscall-support-conditional.patch # Miscellaneous bug fixes bugfix/all/disable-some-marvell-phys.patch bugfix/all/fs-add-module_softdep-declarations-for-hard-coded-cr.patch -bugfix/all/partially-revert-usb-kconfig-using-select-for-usb_co.patch -bugfix/all/HID-apple-Add-missing-scan-code-event-for-keys-handl.patch -bugfix/all/ext4-limit-the-number-of-blocks-in-one-ADD_RANGE-TLV.patch +bugfix/all/fuse-release-pipe-buf-after-last-use.patch +bugfix/all/nfsd-fix-use-after-free-due-to-delegation-race.patch # Miscellaneous features @@ -103,7 +100,6 @@ features/all/lockdown/arm64-add-kernel-config-option-to-lock-down-when.patch # Improve integrity platform keyring for kernel modules verification features/all/db-mok-keyring/0001-MODSIGN-do-not-load-mok-when-secure-boot-disabled.patch -features/all/db-mok-keyring/0002-MODSIGN-load-blacklist-from-MOKx.patch features/all/db-mok-keyring/0003-MODSIGN-checking-the-blacklisted-hash-before-loading-a-kernel-module.patch features/all/db-mok-keyring/0004-MODSIGN-check-the-attributes-of-db-and-mok.patch 
features/all/db-mok-keyring/modsign-make-shash-allocation-failure-fatal.patch @@ -112,6 +108,14 @@ features/all/db-mok-keyring/KEYS-Make-use-of-platform-keyring-for-module-signatu # Security fixes debian/i386-686-pae-pci-set-pci-nobios-by-default.patch debian/ntfs-mark-it-as-broken.patch +bugfix/all/atlantic-Fix-OOB-read-and-write-in-hw_atl_utils_fw_r.patch +bugfix/all/fget-check-that-the-fd-still-exists-after-getting-a-.patch +bugfix/all/USB-gadget-detect-too-big-endpoint-0-requests.patch +bugfix/all/USB-gadget-zero-allocate-endpoint-0-buffers.patch +bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-fetch.patch +bugfix/all/bpf-fix-signed-bounds-propagation-after-mov32.patch +bugfix/all/bpf-make-32-64-bounds-propagation-slightly-more-robust.patch +bugfix/all/bpf-fix-kernel-address-leakage-in-atomic-cmpxchg-s-r0-aux-reg.patch # Fix exported symbol versions bugfix/all/module-disable-matching-missing-version-crc.patch @@ -128,5 +132,6 @@ bugfix/all/cpupower-fix-checks-for-cpu-existence.patch bugfix/all/tools-perf-pmu-events-fix-reproducibility.patch bugfix/all/bpftool-fix-version-string-in-recursive-builds.patch bugfix/all/tools-include-uapi-fix-errno.h.patch +bugfix/all/perf-srcline-Use-long-running-addr2line-per-DSO.patch # ABI maintenance diff --git a/debian/rules.real b/debian/rules.real index ee2f459bf..e15b11a0b 100644 --- a/debian/rules.real +++ b/debian/rules.real @@ -377,7 +377,6 @@ install-libc-dev_$(ARCH): dh_prep rm -rf '$(DIR)' mkdir -p $(DIR) - +$(MAKE_CLEAN) O='$(CURDIR)/$(DIR)' headers_check ARCH=$(KERNEL_ARCH) +$(MAKE_CLEAN) O='$(CURDIR)/$(DIR)' headers_install ARCH=$(KERNEL_ARCH) INSTALL_HDR_PATH='$(CURDIR)'/$(OUT_DIR) rm -rf $(OUT_DIR)/include/drm $(OUT_DIR)/include/scsi diff --git a/debian/templates/control.extra.in b/debian/templates/control.extra.in index ef9015127..d176e8e51 100644 --- a/debian/templates/control.extra.in +++ b/debian/templates/control.extra.in @@ -1,24 +1,24 @@ -Package: linux-compiler-gcc-10-arm +Package: 
linux-compiler-gcc-11-arm Build-Profiles: <!stage1> -Depends: gcc-10, ${misc:Depends} +Depends: gcc-11, ${misc:Depends} Architecture: armel armhf Multi-Arch: foreign Description: Compiler for Linux on ARM (meta-package) This package depends on GCC of the appropriate version and architecture for Linux on armel and armhf. -Package: linux-compiler-gcc-10-s390 +Package: linux-compiler-gcc-11-s390 Build-Profiles: <!stage1> -Depends: gcc-10, ${misc:Depends} +Depends: gcc-11, ${misc:Depends} Architecture: s390 s390x Multi-Arch: foreign Description: Compiler for Linux on IBM zSeries (meta-package) This package depends on GCC of the appropriate version and architecture for Linux on s390 and s390x. -Package: linux-compiler-gcc-10-x86 +Package: linux-compiler-gcc-11-x86 Build-Profiles: <!stage1> -Depends: gcc-10, ${misc:Depends} +Depends: gcc-11, ${misc:Depends} Architecture: amd64 i386 x32 Multi-Arch: foreign Description: Compiler for Linux on x86 (meta-package)