Sensor size: myths and reality

Digital system cameras currently use mainly three different sizes/formats: full-frame (36 x 24 mm, with slight variations), APS-C (approximately 24 x 16 mm, with more significant variations), and Micro 4/3 (17.3 x 13 mm). In itself, the "full frame" denomination, if applied to solid-state sensors, is misleading: each sensor gives you a full image, regardless of its size. Although a Micro 4/3 sensor is roughly one-quarter the area of a full-frame sensor, no Micro 4/3 camera will give you just one-quarter of the image you framed in the viewfinder. Full frame is only full in the sense that its sensors are the same size as the frames of 35 mm still-image film cameras.

In addition to the many "full-frame" film cameras using 35 mm film (which at the time were actually called "small format" cameras), in the 1960s there were also a few "half-frame" cameras using the same film, which produced 24 x 18 mm frames. Each frame was in portrait aspect when viewing a horizontal strip of film, so most of these half-frame cameras had to be held vertically in order to shoot in landscape orientation and horizontally to shoot in portrait orientation (just the opposite of full-frame cameras).

The two advantages of half-frame cameras were:
- The cost of film (especially the more expensive color film) was cut in half.
- Initially at least, half-frame cameras were much smaller than full-frame ones (although, after a while, equally small full-frame cameras were introduced by Olympus and Rollei).

Half-frame cameras may well be where the insistence on "full frame" as better than "less-than-full-frame" originated. Of course, when using the same film type, half-frame cameras could record less detail per frame. It did not make a difference in small prints from a neighborhood one-hour photolab, but it was visible in larger prints produced by better enlargers. Thus, in an era when 6 by 6 cm film was "professional" and 36 x 24 mm was already looked at with disdain by many professional photographers, half frame was simply rejected by professional and amateur photographers alike, and was briefly popular only among occasional photographers.

The photographic industry often seems to promote full frame digital cameras as something that a professional photographer should choose unquestioningly. This is far from the truth: many professional photographers use APS-C and Micro 4/3 cameras because these sensor formats offer desirable characteristics, like savings in size and weight of lenses for these formats. If professionalism is measured by sensor size, then every photographer aiming for top-notch professional recognition should consider using a camera like the LSST, with its 640 mm wide sensor plane of 3,200 Mpixels and a front lens diameter of 1,650 mm. Don't let anyone tell you that such a camera would be impractical to use: imagine the advantage of photographing every marriage reception and sports event within a 20 mile radius without leaving the roof of your purpose-built studio, and your camera being big enough to be visible to potential customers and competitors alike from miles away.

Much has been said about the properties of different sensor sizes. Some of it is true, but much is urban legend, hearsay, or simply incorrect. To be fair, comparing cameras with different sensor sizes is fraught with pitfalls, and in some respects it is just impossible to truthfully state what is "equivalent" in different formats, because if you accept some properties as being equivalent among formats, you always end up with other properties being clearly non-equivalent. In other words, you will never be able to compare apples and oranges by regarding them as equivalent to each other.

The following discussion compares Micro 4/3 with full frame, because they are the extremes of the size range I chose to discuss. I am not specifically discussing APS-C, but the discussion is indirectly applicable to this format as well.

Micro 4/3, and 4/3 before it, use the latest sensor size to become very popular in system cameras. Other sensor sizes than the three mentioned above have been used, both larger and smaller, but most of them are now only footnotes in camera history (for example, the 13.2 x 8.8 mm Nikon 1 format, used in one of the first lines of mirrorless system cameras). Of the three formats discussed here, Micro 4/3, being both the newest and smallest, has been subjected to the fiercest criticism.

Pixel count of Micro 4/3 sensors

The "limited" pixel count of Micro 4/3 cameras (around 20 Mpixel for the past six years) is often presented as an argument for preferring cameras with larger sensors and a higher pixel count, usually full frame. I own and use both Micro 4/3 and full-frame cameras, the latter up to 42 Mpixel. First of all, let me state the obvious: A 42 Mpixel sensor is useless without lenses and camera-handling technique that match the sensor resolution.

Full-frame lenses capable of 42 Mpixel resolution are not cheap. When shooting with long telephoto lenses, full-frame requires a focal length about twice as long as Micro 4/3 in order to record a similar field of view. Thus, a Micro 4/3 photographer carries in the field a 300 mm f/4 in a 2.5 kg backpack (easily carried for a whole day), and shoots hand-held in normal illumination conditions. This lens focuses as close as 1.2 m and provides at this distance a field of view of 72 x 54 mm, well into close-up photography.

Meanwhile, a full-frame photographer, to achieve the same angle of view and exposure time, must carry a 600 mm f/4 (in the best case, three times the weight of the 300 mm) in a heavy backpack, and in most cases also a suitably large tripod because shooting with the 600 mm hand-held or on a monopod is physically too taxing. This 600 mm typically focuses only as close as 4.5 m, and its minimum field of view is 240 x 160 mm, i.e. almost 10 times the subject area of the 300 mm at its closest focus. The 600 mm lens also costs multiple times more than the 300 mm, and is about as inconspicuous in a public place as a shoulder-mounted anti-tank rocket launcher. However, let's ignore these key factors in the following discussion.
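The figures in the two preceding paragraphs can be checked with a few lines of arithmetic. The following Python sketch uses only the sensor dimensions and fields of view quoted above. The helper names are mine and purely illustrative, and the "crop factor" is computed here as the ratio of sensor diagonals, which is one common convention rather than the only possible one.

```python
import math

# Sensor dimensions in mm, as quoted earlier in the text.
FULL_FRAME = (36.0, 24.0)
MICRO_43 = (17.3, 13.0)

def diagonal(size):
    """Diagonal of a (width, height) rectangle, in the same unit."""
    return math.hypot(size[0], size[1])

def area(size):
    """Area of a (width, height) rectangle."""
    return size[0] * size[1]

def crop_factor(small, large=FULL_FRAME):
    """Ratio of sensor diagonals (illustrative helper)."""
    return diagonal(large) / diagonal(small)

# Fields of view at closest focus, in mm, as quoted above.
fov_300mm = (72.0, 54.0)
fov_600mm = (240.0, 160.0)

print(f"crop factor: {crop_factor(MICRO_43):.2f}")
print(f"subject area ratio: {area(fov_600mm) / area(fov_300mm):.1f}")
```

The first value comes out very close to 2, which is why a 300 mm lens on Micro 4/3 frames like a 600 mm lens on full frame, and the second is just under 10.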

Aside from printing poster-sized images that must be viewed at close range, or cropping away three-quarters of the image area in post-processing, I cannot really see a practical need for an actual image resolution exceeding about 20-25 Mpixel. 20 Mpixel, with high-quality lenses and good technique, provides far more image detail than required by 95% of the commercial uses of images. Additionally, for much of the remaining 5% of uses, sensor-shift multi-exposure techniques can effectively quadruple the pixel count (with static subjects), and AI-based post-processing resolution enhancements are beginning to produce convincing results. Therefore, I do not agree that a Micro 4/3 camera of good quality, like the OM-1, is intrinsically inadequate as a professional tool because of its "low" sensor pixel count.

Light gathering properties of different sensor sizes

Much has been written about the light-gathering properties of sensors of different sizes. It is often stated that a larger sensor collects more light, which in turn provides a better signal-to-noise ratio in low light and dark portions of the image. Much as this seems self-evident, I am going to argue that this is not always true or significant.

In this discussion, one must be careful not to compare apples and oranges (and this mistake is easily made when comparing sensors of different sizes). Therefore, we should start by comparing sensors of different sizes, but with a comparable pixel count. Needless to say, a full-frame sensor with 80 Mpixel has no light-collecting advantage per pixel over a 20 Mpixel Micro 4/3 sensor. The pixels on both sensors have essentially the same size. Therefore, at the same lens aperture, subject illumination intensity and exposure time, each pixel collects the same number of photons.
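As a rough check of the pixel-size statement, the sketch below estimates the pixel pitch from sensor area and pixel count. It assumes square pixels that tile the whole sensor with no gaps, which is a simplification, and the function name is hypothetical.

```python
import math

def pixel_pitch_um(width_mm, height_mm, megapixels):
    """Approximate side length of one square pixel, in micrometres."""
    area_um2 = (width_mm * 1000.0) * (height_mm * 1000.0)
    return math.sqrt(area_um2 / (megapixels * 1e6))

full_frame_80 = pixel_pitch_um(36.0, 24.0, 80)
micro43_20 = pixel_pitch_um(17.3, 13.0, 20)

# The two pitches differ by only about 2%.
print(f"pitch ratio (full frame / Micro 4/3): {full_frame_80 / micro43_20:.2f}")
```

The small residual difference comes from the full-frame sensor having slightly less than four times the area of the Micro 4/3 sensor.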

Let's therefore assume a 20 Mpixel sensor resolution and a Bayer sensor architecture in the following discussion. The Micro 4/3 OM System OM-1 uses a sensor with quad-pixel structure and 80 million photodiodes, but, when not using its pixel-shift modes, at present outputs 20 Mpixel images like preceding Olympus cameras (this could potentially change in future firmware updates). We should also compare sensors designed and produced at roughly the same time. It would be unfair to compare a modern camera with a 17-year-old Nikon D70s. We should also compare sensors at the same ISO. We can assume ISO 200, since this is the most common base sensitivity for the sensor of a system camera.

Subject at infinity

Let's examine first the light-gathering properties of a camera for a subject located at infinity (or at a distance equal to several times the lens focal length).

At the same lens aperture (let's say f/2.8) and in the conditions outlined above, the intensity of the light reaching the sensor, as well as the proper exposure time for a given scene, are the same regardless of sensor size and lens focal length. Intensity expresses the number of photons per unit of sensor area (not per pixel) and per unit of time. Thus, the larger area of a full-frame sensor receives about four times as many photons as a Micro 4/3 sensor. At the same sensor pixel count, the same difference applies to the light received by an individual pixel on either sensor: the physically larger pixel collects about four times as many photons as the smaller one, all other things remaining the same. This corresponds to a two-stop advantage for the larger sensor, which may make a difference (although it is not necessarily visible in average shooting conditions), and requires further discussion.
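The conversion from a photon ratio to stops is a base-2 logarithm, since one stop is a doubling of light. A minimal sketch, using the nominal sensor dimensions given at the start of this article (the function name is illustrative):

```python
import math

def stops(ratio):
    """Number of stops corresponding to a ratio of light quantities."""
    return math.log2(ratio)

area_full_frame = 36.0 * 24.0  # mm^2
area_micro43 = 17.3 * 13.0     # mm^2

# At equal intensity (photons per unit area and time), total photons
# scale with sensor area.
photon_ratio = area_full_frame / area_micro43

print(f"photon ratio: {photon_ratio:.2f}")
print(f"advantage: {stops(photon_ratio):.2f} stops")
```

With these nominal dimensions the ratio is slightly below 4 and the advantage slightly below 2 stops, so "four times" and "two stops" are convenient round figures.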

ISO and noise

It is reasonable to expect that the electron well underneath a sensor photodiode is designed with the proper size to receive the number of electrons collected by the photodiode during a normal exposure. Therefore, an electron well in a full-frame sensor should be able to contain four times the number of electrons as an electron well in a Micro 4/3 sensor.

The quantum efficiency (i.e. the portion of photons converted to electrons) is 95% in green light for very high-quality sensors, and not far from 90% in consumer cameras. No significant differences in quantum efficiency should be expected among different modern consumer cameras and different sensor sizes. However, minor manufacturing defects may cause small variations in quantum efficiency among pixels in the same sensor. While it is a relatively simple matter to identify "stuck" or "dead" pixels that always generate the same response regardless of actual pixel illumination, and correct these sensor defects in-camera, correcting these small variations would be much more complex, and is not generally done. Thus, sensors display a small amount of this static type of noise (as opposed to dynamic noise, which changes stochastically from image to image).

A large number of electrons collecting in an electron well will eventually prevent additional electrons from entering the well. Electrons repel each other, so the more electrons in a limited space, the higher the "pressure" (charge) that must be overcome to store additional electrons. When the charge increases beyond a certain level, the quantum efficiency of the pixel decreases rapidly and non-linearly. At a certain point, no more electrons can be added to the electron well, the brightest portions of the image turn into white or a uniform bright color (often a very pale blue), and no image information can be recovered from them. This phenomenon is called saturation or clipping. In principle, saturation may occur also in the post-processing pipeline, but this is not likely if the camera electronics are properly designed, and ISO, contrast and exposure are properly set. In real-world photography situations, saturation is most likely to happen in the electron wells and in the brightest portions of an image.

The electronics used to read the electrons from an electron well and convert their number first to an analog signal, and subsequently to a digital value, are quite efficient and their noise is low. Typically, most of the stochastic noise accumulates in the electron well, and is caused e.g. by thermal noise (which increases with sensor temperature) and electron leaks in the photodiode and well (which increase with exposure time). Smaller amounts of noise are added in the pipeline transporting the electron charges from the wells to an analog-to-digital converter (ADC), and in the ADC itself. Once the signal is in digital format, it remains essentially immune to further noise.

The discrete number of electrons contained in an electron well is, in itself, a stochastic source of noise. When this number decreases, the relative error grows bigger. For example, an electron well may contain either 9 or 10 electrons, but not 9.5. Therefore, the signal read from this charge may intrinsically be off by up to ±5%, even in the absence of all other noise sources. In addition, the distribution of photons on the surface of the sensor is not homogeneous but varies stochastically, especially at low photon counts, which increases the total error. Thus, even with a test target completely featureless and illuminated as evenly as possible, adjacent electron wells may contain a significantly different number of electrons at the end of the exposure.
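To give a feeling for how these two effects scale with the electron count, here is a small sketch. The rounding error follows directly from the example above. The second function assumes that photon arrival follows Poisson statistics, which is the standard textbook model and an assumption of this sketch rather than something derived in this article. Both function names are hypothetical.

```python
import math

def rounding_error(n_electrons):
    """Worst-case relative error from counting only whole electrons."""
    return 0.5 / n_electrons

def shot_noise_fraction(n_electrons):
    """Relative standard deviation sqrt(N)/N, assuming Poisson statistics."""
    return 1.0 / math.sqrt(n_electrons)

for n in (9.5, 100, 10_000, 100_000):
    print(f"{n:>9} electrons: "
          f"rounding ±{100 * rounding_error(n):.3f}%, "
          f"shot noise ±{100 * shot_noise_fraction(n):.2f}%")
```

Under this assumption, the stochastic variation in photon arrival dominates over the rounding error at every electron count, and both shrink as the well fills.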

According to this article in Nature, a single electron well in a camera sensor collects very roughly 10⁵ electrons in a normally exposed image at base ISO. This number is sufficient to smooth out the stochastic fluctuations caused by the quantized photon flux. The same article mentions that a recognizable image can be recorded even with less than one photon (on average) per photodiode.

In principle, the difference between a "normally filled" electron well of about 10⁵ electrons and one that contains a single electron corresponds to roughly 17 stops, and the difference between a base sensitivity of 200 ISO and the maximum enhanced sensitivity allowed by the OM-1 (102,400 ISO) corresponds to 9 stops (ignoring non-linearities, which do take place). This means that, at high ISO and low illumination levels, it is entirely possible for individual electron wells, especially in the darker portions of the image, to contain a sufficiently low number of electrons to make stochastic noise significant, especially if the image is recorded in raw format (12 or 14 bits per pixel).
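The effect of raising the ISO on the electron count can be sketched as follows. This is a deliberately idealized, linear model: it takes the rough figure of 10⁵ electrons at base ISO quoted above, assumes that each stop of ISO halves the exposure, and ignores the non-linearities already mentioned. The names are illustrative.

```python
import math

BASE_ISO = 200
MAX_ISO = 102_400
ELECTRONS_AT_BASE = 1e5  # rough figure for a normally exposed pixel

def iso_stops(iso, base=BASE_ISO):
    """Stops of sensitivity increase relative to base ISO."""
    return math.log2(iso / base)

def electrons_in_well(iso, stops_below_normal=0):
    """Idealized electron count for a tone that is `stops_below_normal`
    stops darker than a normally exposed one."""
    return ELECTRONS_AT_BASE / (iso / BASE_ISO) / 2 ** stops_below_normal

print(f"ISO range: {iso_stops(MAX_ISO):.0f} stops")
print(f"normal tone at max ISO: {electrons_in_well(MAX_ISO):.0f} electrons")
print(f"shadow, 5 stops darker: {electrons_in_well(MAX_ISO, 5):.0f} electrons")
```

With these assumptions, a normally exposed pixel at ISO 102,400 holds on the order of 200 electrons, and a shadow five stops darker only a handful.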

A much simplified explanation of how the ISO setting in a digital camera works is that:
- Each pixel collects and converts to electrons almost all the photons it receives during the exposure. At this stage, the sensor does its job largely without knowing what ISO setting is configured in the camera.
- This number of electrons is subsequently multiplied by a transfer function, based on the ISO setting.
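The two steps above can be sketched as a toy model. Everything in it is an assumption made for illustration: a purely linear gain, a well that is "full" at 100,000 electrons, a 12-bit output, and no read noise or black-level offset. Real cameras are more complicated, and the function name is hypothetical.

```python
def to_digital(electrons, iso, base_iso=200, full_well=100_000, bits=12):
    """Toy model: scale the electron count by an ISO-dependent gain,
    then clip the result to the range of the digital output."""
    max_value = 2 ** bits - 1
    # Digital units per electron; doubles with every stop of ISO.
    gain = (iso / base_iso) * max_value / full_well
    return min(max_value, round(electrons * gain))

# The same 1,000 electrons give very different output values:
print(to_digital(1_000, iso=200))      # dark tone at base ISO
print(to_digital(1_000, iso=102_400))  # clipped at the maximum value
```

The point of the sketch is that the electron count is fixed by the exposure, while the ISO setting only changes how that count is scaled afterwards. Any noise already present in the count is scaled together with the signal.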

Unsurprisingly, high ISO produces a high total noise, and the camera firmware applies (usually) heavy-handed post-processing to try and hide noise in images shot at very high ISO. This post-processing is a compromise between hiding as much of the noise as possible and avoiding visible artifacts. Thus, the noise cannot be completely eliminated.

AI-based noise reduction performed by modern software on a modern PC workstation is allowed to perform extreme amounts of processing, thanks to the powerful CPU and GPU, large RAM, and essentially no time and energy limits. The graphic processors of system cameras, although specialized, are much less powerful and can only perform a simpler processing in the time between other tasks, like storing images to a memory card. The CPU of a mobile phone can do even less processing on its intrinsically higher-noise images within its time, energy and RAM constraints, and therefore allows visible artifacts to appear on a much larger scale than acceptable on a system camera or in PC software.

All the above means that a digital camera with a physically larger sensor, in the conditions set out at the beginning of this section, indeed records a higher number of photons, and therefore an image comparatively less noisy at low illumination and high ISO. In the prevalent, "normal" illumination conditions and low ISO, however, the intrinsic difference caused by sensor size affects image quality only slightly among system cameras with sensors of different sizes, and typically not enough to be visually significant.

A two-stop difference in noise is typically less than the difference in image quality introduced by in-camera noise reduction, or by external raw-conversion software.

Subject in macro photography

In macro photography, the photographer usually decides to record a specific field of view, then frames and focuses this field of view before shooting.

A smaller sensor needs to work at a lower magnification to record the same field of view. While a full-frame camera requires shooting at 1x to record a field of view of 36 x 24 mm, a Micro 4/3 camera can do the same at 0.5x (allowing for minor differences caused by the different aspect ratio). Ignoring the effects of lens pupil ratio, when shooting at 0.5x the effective aperture is about 1 stop slower than the nominal aperture (i.e. the aperture at infinity focus), while at 1x it is 2 stops slower. Thus, the advantage of the larger sensor shrinks from 2 stops at infinity to about 1 stop when the full-frame camera shoots at 1x and the Micro 4/3 camera at 0.5x, which is only a modest advantage.
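The stops quoted above follow from the usual approximation for the effective aperture at magnification m, namely effective f-number = nominal f-number x (1 + m), with the pupil ratio ignored as stated. The sketch below applies it to both formats. The function names are mine, and the 2-stop sensor advantage at infinity is taken as a round figure from the earlier discussion.

```python
import math

def effective_f_number(nominal, magnification):
    """Effective f-number at a given magnification, ignoring pupil ratio."""
    return nominal * (1.0 + magnification)

def stops_lost(magnification):
    """Light loss in stops relative to infinity focus.
    Light falls with the square of the f-number, hence the factor 2."""
    return 2.0 * math.log2(1.0 + magnification)

SENSOR_ADVANTAGE_AT_INFINITY = 2.0  # stops, round figure

loss_full_frame = stops_lost(1.0)  # full frame at 1x
loss_micro43 = stops_lost(0.5)     # Micro 4/3 at 0.5x, same field of view
net_advantage = SENSOR_ADVANTAGE_AT_INFINITY - (loss_full_frame - loss_micro43)

print(f"f/2.8 at 1x behaves like f/{effective_f_number(2.8, 1.0):.1f}")
print(f"f/2.8 at 0.5x behaves like f/{effective_f_number(2.8, 0.5):.1f}")
print(f"net full-frame advantage: {net_advantage:.2f} stops")
```

The loss at 0.5x is slightly more than 1 stop, so the net advantage comes out slightly above 1 stop, in line with the round figure used in the text.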

In summary, the larger sensor performs only slightly better than the smaller sensor in macro photography at the same field of view. It is difficult to see a one-stop difference in noise even in pairs of test images taken in carefully controlled conditions, and almost impossible in real-world images.

Diffraction on sensor pixels?

I remember reading somewhere, as an early criticism against Micro 4/3 cameras, a claim that the pixels on their sensor, being physically smaller than in full-frame cameras, are more strongly affected by diffraction taking place on the sensor, which reduces the actual image resolution. I have been unable to find the source of this argument again.

Individual pixels on a 20 Mpixel Micro 4/3 camera are arranged in a square raster with cell sizes of about 3.3 μm (17.3 x 13 mm divided among 20 million pixels), which is roughly six times the wavelength of green light (0.55 μm). Microscopists know well that the smallest detail resolvable with a conventional optical microscope is half the wavelength of light (0.275 μm in this case). Therefore, diffraction on the sensor is negligible, and the above claim has no basis in reality. It would take a really tiny sensor and/or an extremely high pixel count to run into this type of problem. Even the 80 million photodiodes of the OM System OM-1 are 1.7 μm, and still far from the size that would make diffraction a problem.

Summary

A full-frame sensor does indeed collect about four times as many photons as a Micro 4/3 sensor, at the same exposure and with a subject located at infinity. This gives the full-frame sensor a 2-stop advantage. In macro photography, however, the smaller sensor needs to be used at a lower magnification to achieve the same field of view. For example, a Micro 4/3 camera at 0.5x covers the same field of view as a full-frame camera at 1x. This reduces the advantage of the full-frame sensor to about 1 stop. There is therefore an advantage in light collection for larger sensors, but it is only a moderate advantage at infinity, and a small advantage in macro photography.