The limit of power extraction by a device which makes use of constructive interference, i.e. local blockage, is investigated theoretically. The device is modelled using actuator disc theory in which we allow the device to be split into arrays and these then into sub-arrays an arbitrary number of times so as to construct an $n$-level multi-scale device in which the original device undergoes $n-1$ sub-divisions. An alternative physical interpretation of the problem is a planar system of arrayed turbines in which groups of turbines are homogeneously arrayed at the smallest, $n\mathrm{th}$, scale, and then these groups are homogeneously spaced relative to each other at the next smallest, $(n-1)\mathrm{th}$, scale, with this pattern repeating at all subsequent larger scales. The scale-separation idea of Nishino & Willden (J. Fluid Mech., vol. 708, 2012b, pp. 596–606) is employed, which assumes mixing within a sub-array occurs faster than mixing of the bypass flow around that sub-array, so that in the $n$-scale device mixing occurs from the inner scale to the outermost scale in that order. We investigate the behaviour of an arbitrary-level multi-scale device, and determine the arrangement of actuator discs ($n\mathrm{th}$ level devices) which maximises the power coefficient (ratio of power extracted to undisturbed kinetic energy flux through the net disc frontal area). We find that this optimal arrangement is close to fractal, and fractal arrangements give similar results. With the device placed in an infinitely wide channel, i.e. zero global blockage, we find that the optimum power coefficient tends to unity as the number of device scales tends to infinity, an increase by a factor of $27/16$ over the Lanchester–Betz limit of $16/27 \approx 0.593$. For devices in finite-width channels, i.e. non-zero global blockage, similar observations can be made with further uplift in the maximum power coefficient.
We discuss the fluid mechanics of this energy extraction process and examine the scale distribution of thrust and wake velocity coefficients. Numerical demonstration of performance uplift due to multi-scale dynamics is also provided. We demonstrate that bypass flow remixing and ensuing energy losses increase the device power coefficient above the limits for single devices, so that although the power coefficient can be made to increase, this is at the expense of the overall efficiency of energy extraction, which decreases as wake-scale remixing losses necessarily rise. For multi-scale devices in finite overall blockage, two effects act to increase extractable power: an overall streamwise pressure gradient associated with finite blockage, and wake pressure recoveries associated with bypass-scale remixing.
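For reference, the single-scale benchmark against which the multi-scale result is compared can be recovered from the classical actuator disc result for a single disc in an unbounded stream. The sketch below is the textbook derivation, not the multi-scale analysis of this paper, and the axial induction factor $a$ (the fractional velocity deficit at the disc) is notation assumed here for illustration:

$$
C_P(a) = 4a(1-a)^2, \qquad
\frac{\mathrm{d}C_P}{\mathrm{d}a} = 4(1-a)(1-3a) = 0
\;\Rightarrow\; a = \tfrac{1}{3}, \qquad
C_{P,\max} = 4\cdot\tfrac{1}{3}\cdot\left(\tfrac{2}{3}\right)^{2} = \tfrac{16}{27} \approx 0.593 .
$$

A power coefficient of unity therefore exceeds this limit by the factor $1 \big/ \tfrac{16}{27} = \tfrac{27}{16} \approx 1.69$.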