Interim Conclusions

Whatever will happen with this speaker long term, the experiment is extremely helpful and interesting so far in terms of psycho-acoustic relations. Let's start with...

The drivers and crossover
Once more, the Monacor proves that it is multi-talented. Such drivers are a gift to the DIY community with its ever-changing setups and applications. Sorry for the hi-fi magazine vocabulary, but the bass is tight and otherwise unobtrusive. The woofers are equalized down to 30Hz with a Q of 0.5.
The Vifa is...a real cracker for the money ! Apart from its relatively weak motor, which begs for active equalization, the sound is really fantastic: natural, well balanced, and it does not draw any unnecessary attention to itself. It cannot reproduce the smallest details with the finesse of the ScanSpeak drivers, but that is what makes the digits on a price tag. The Vifa is a darn good bargain !
The mid range equalization requires a delicate touch and is therefore still ongoing. Right now the drivers are separated @ 250Hz using an LR4 acoustical crossover. Acoustical because the Vifa is integrated into a Linkwitz transform followed by a 2nd order Butterworth HP. This was impossible with my previous digital crossover, but for the new MiniDSP 2x8 it's a piece of cake.
In this setup the speakers can even play at considerable sound pressure levels without too much stress. So to my surprise they are actually loud enough, which must be a result of the considerable amount of reflections they generate.
Long term, the bass could benefit from a longer throw and even a 10" driver would not be oversized. Peerless' SLS-8 and 10 come to mind as good partners.
The overall equalization is monotonically falling starting @ 100Hz. Right now it would be down around 5.5 dB at 20KHz if the Vifa reached that far.
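
The LR4 acoustic target at 250Hz can be checked numerically. The following is a minimal sketch, not the actual MiniDSP configuration. It assumes that the Linkwitz transform shapes the Vifa into a 2nd order Butterworth high-pass (Q = 0.707) at 250Hz, which the text does not state explicitly, and that the woofer side is the mirror-image low-pass. Cascading this acoustic high-pass with the electrical 2nd order Butterworth HP gives a 4th order Linkwitz-Riley slope. All function names are made up for illustration.

```python
import math

def butterworth2_highpass(f_hz, fc_hz):
    """Complex response of a 2nd order Butterworth high-pass (Q = 1/sqrt(2))."""
    s = 1j * f_hz / fc_hz  # normalized complex frequency
    return s**2 / (s**2 + math.sqrt(2) * s + 1)

def butterworth2_lowpass(f_hz, fc_hz):
    """Complex response of a 2nd order Butterworth low-pass (Q = 1/sqrt(2))."""
    s = 1j * f_hz / fc_hz
    return 1 / (s**2 + math.sqrt(2) * s + 1)

def lr4_highpass(f_hz, fc_hz):
    """LR4 = two cascaded 2nd order Butterworth sections (assumed split:
    one acoustic via the Linkwitz transform, one electrical)."""
    return butterworth2_highpass(f_hz, fc_hz) ** 2

def lr4_lowpass(f_hz, fc_hz):
    return butterworth2_lowpass(f_hz, fc_hz) ** 2

def to_db(h):
    return 20 * math.log10(abs(h))

FC = 250.0  # crossover frequency from the text

# Each branch is 6 dB down at the crossover frequency ...
hp_at_fc_db = to_db(lr4_highpass(FC, FC))
lp_at_fc_db = to_db(lr4_lowpass(FC, FC))

# ... and the summed magnitude stays flat at every frequency.
sum_db = [to_db(lr4_lowpass(f, FC) + lr4_highpass(f, FC))
          for f in (50.0, 125.0, 250.0, 500.0, 1000.0)]

print(f"HP at 250Hz: {hp_at_fc_db:.2f} dB, LP at 250Hz: {lp_at_fc_db:.2f} dB")
print("sum:", [round(v, 6) for v in sum_db])
```

The flat sum only holds if both branches really reach their acoustic targets, which is why the Linkwitz transform on the Vifa matters here.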

First listening impressions
Some people believe that such a polar response is a mess or even a disaster in small rooms. Fortunately, that is not true. It works very well for recreational listening. At first I thought the image was a bit fuzzy, until I realized that the auditory scene (AS) is just a little bigger than with my other speakers or anything I have auditioned so far. But all proportions are intact, including the perception of depth, and adaptation takes place quickly and easily once the brain recognizes what's going on. It is a bit like switching from a big TV to a cinema screen. As a result, I can understand why some people suggest that omnis are best played in bigger rooms. Our listening area is 4.7m long and 3.9m wide, which is not too small for these speakers and their image size.
Horizontally, the scene stretches across 180° and it seems there are no room boundaries whatsoever, whereas pink noise stops at the walls. With all recordings "worked through" so far, the speakers disappear entirely. They play effortlessly and untiringly, and it is yet again a set that makes you crank up the volume. Doing so, you will notice that the remote control changes the distance and size of the AS. The louder, the bigger and nearer things get, although both sensations stop at a certain loudness level. So the AS will not reach the tip of your nose and there is no danger that a rock guitar player hits you with the instrument.
With this bigger than usual AS there is initially a slight tendency towards the "they are here" perception. But "a front row seat in a you are there situation" describes the sensation best. Instruments and singers (and musicians if they move) are easy to locate on the virtual stage.
The horizontal sweet spot is not endlessly wide but it can easily host two persons on the couch.
With equalization in place, the sound changes only very little when you stand up or walk past the speakers. There is virtually no vertical sweet spot. This was not expected from the vertical polar response.
The highs do not stick to the reflector cones at all, which lets you forget the visually almost scary diffraction effects. Yet, they consume the biggest share of the equalization labor.

Room setup:
190cm from the front wall.
Tubes are 177cm apart and woofers are oriented to the middle.
Distance from triangle base to apex (listener position) is 180cm.
Left tube 110cm from the left side wall.
Right tube 103cm from the right wall.
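
The setup figures above can be cross-checked with a little geometry. This is only a sketch and assumes that the listener sits exactly on the center line between the tubes. The variable names are made up for illustration.

```python
import math

# Figures from the room setup list (all in cm).
tube_spacing_cm = 177.0
base_to_apex_cm = 180.0
left_wall_gap_cm = 110.0
right_wall_gap_cm = 103.0

half_spacing_cm = tube_spacing_cm / 2

# Straight-line distance from the listener to each tube.
distance_to_tube_cm = math.hypot(half_spacing_cm, base_to_apex_cm)

# Angle between the two tubes as seen from the listening position.
subtended_angle_deg = 2 * math.degrees(math.atan2(half_spacing_cm, base_to_apex_cm))

# Wall gaps plus tube spacing should add up to the room width.
room_width_cm = left_wall_gap_cm + tube_spacing_cm + right_wall_gap_cm
side_wall_asymmetry_cm = left_wall_gap_cm - right_wall_gap_cm

print(f"distance to each tube: {distance_to_tube_cm:.0f} cm")
print(f"subtended angle: {subtended_angle_deg:.1f} degrees")
print(f"room width: {room_width_cm:.0f} cm, asymmetry: {side_wall_asymmetry_cm:.0f} cm")
```

The summed width agrees with the 3.9m quoted for the listening area, and the triangle is somewhat narrower than the classic 60° stereo setup.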

Interim Conclusions:
Again, this speaker is a great psycho-acoustic experience and it does come with a set of sensations. This "big sound (AS) from small speakers" is, however, quite unexpected initially. But since the auditory scene has grown proportionally I would not call these sensations spatial distortion as the AS in its entirety remains believable and plausible.

 

Update 09-Oct-2011:
I directly compared the AS to Aristoteles’ now. Actually, the only directions that contribute more are the sides and the depth. This creates the impression of getting a little closer to the stage. But as mentioned, the “picture” is not distorted.
Anyway, the longer I listen to them, the more fun they are.

 

Update 28-Jan-2012:
Somehow Demokrit reminds me of London's Battersea power station, which is on the cover of Pink Floyd's Animals album, one of their greatest albums ever. I already listened to it on my first DIY equipment with the redone Dual speakers and some TDA2030 "power" amps. Their thermal shutdown circuitry always tripped during "Pigs" at 4'42" / 43" into the song. My mind still somewhat expects this short interruption today. Anyways...

...the speaker will stay, that is definitive !

It was so haunting that I have listened to it ever since the last update !
Even today I am amazed by the little Vifa unit, every day and every minute. It can take up and re-radiate so much energy. Such a driver for 15 bucks...unbelievable. I will use it again as soon as another meaningful application comes up.

The impression about the AS has not changed long term and it remains believable. Although the phrase "auditory scene" is a bit understated: This speaker creates something more like a scenery, a landscape or panorama in width and depth.

A 10" woofer is not required (although it would not hurt either). A long throw 8" such as the SLS-8 would be sufficient (though it might be jumping heavily when used during home cinema sessions).

Meanwhile I found the "gotcha" with the agitated vertical response: The reflector must not slant from its intended position/orientation. Otherwise the phantom imaging is very noticeably affected. I noticed that by chance after moving the speakers around.

Aside from this particular speaker, on some recordings the voice of the artist is not rendered in the middle of the transducers. When that was observed with Demokrit, I verified the recordings with male and female voices with headphones on completely different equipment to rule out tolerances in the gain structure of my pre-amp, power amps and x-over. So far I have not found any deviation. If the voice is off center with the speakers it is off with headphones as well.
I am mentioning all that because with such a speaker the room is maximally engaged. As a result, the slightest asymmetry of the room might become detectable. My room is not fully symmetrical and thus suspect. The right speaker is 7 cm closer to the wall and the left wall does not stretch all the way to the back wall. Instead, it opens up into the dining area. All this, however, shows that Demokrit is fairly immune to asymmetries, which is not intuitive.

I am very critical of my designs. But so far I haven't found anything wrong with this speaker. The imaging is great and the auditory scene is as described previously. For me, the audio myth that omnis do not image well has clearly been busted with this experiment !

 

Update Mar-12-2012
The Monacor woofers have now been replaced by Peerless SLS-10 in order to create a more solid bass foundation. The SLS-10 is, despite its age, ideal for this application: it plays relaxed and effortlessly, with enough reserves. The closed box (CB) has about 21L with the driver mounted, and the mounted drivers yield the following parameters:

Parameter    1st set    2nd set
Fs (F0)      61.3 Hz    60.8 Hz
Re           5.6 Ω      5.6 Ω
Le           1.7 mH     1.7 mH
Qmc          8.68       8.80
Qec          1.21       1.22
Qtc (Q0)     1.06       1.07

The equalization target of 30Hz and a Qt of 0.5 was not changed and the speaker's soul is still the same.
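
The measured values and the equalization target can be put into numbers. The sketch below first checks that Qtc follows from Qmc and Qec, and then evaluates the standard Linkwitz transform biquad that moves the measured box response (F0, Q0) to the target (30Hz, Q of 0.5). It is an illustration only: the actual filter settings in the MiniDSP are not given in the text, and the function names are made up.

```python
import math

def total_q(q_mechanical, q_electrical):
    """Total Q of the driver in its box from mechanical and electrical Q."""
    return q_mechanical * q_electrical / (q_mechanical + q_electrical)

def linkwitz_transform_gain_db(f_hz, f0_hz, q0, fp_hz, qp):
    """Gain of the Linkwitz transform at f_hz.

    (f0_hz, q0) is the measured closed box alignment, (fp_hz, qp) the target.
    H(s) = (s^2 + (w0/q0) s + w0^2) / (s^2 + (wp/qp) s + wp^2); the common
    factor (2*pi)^2 cancels, so plain frequencies in Hz can be used.
    """
    s = 1j * f_hz
    numerator = s**2 + (f0_hz / q0) * s + f0_hz**2
    denominator = s**2 + (fp_hz / qp) * s + fp_hz**2
    return 20 * math.log10(abs(numerator / denominator))

# Both measured parameter sets from the table above.
qtc_first = total_q(8.68, 1.21)
qtc_second = total_q(8.80, 1.22)

# Boost required by the first set to reach the 30Hz / Q = 0.5 target.
F0, Q0 = 61.3, 1.06
FP, QP = 30.0, 0.5
low_frequency_boost_db = linkwitz_transform_gain_db(0.001, F0, Q0, FP, QP)
gain_at_10k_db = linkwitz_transform_gain_db(10000.0, F0, Q0, FP, QP)

print(f"Qtc: {qtc_first:.2f} and {qtc_second:.2f}")
print(f"boost towards DC: {low_frequency_boost_db:.1f} dB")
for f in (20.0, 30.0, 61.3, 120.0):
    print(f"{f:6.1f} Hz: {linkwitz_transform_gain_db(f, F0, Q0, FP, QP):+.1f} dB")
```

The boost towards DC equals 40·log10(F0/Fp), roughly 12 dB here, which is the price in excursion and amplifier power for pushing a 61Hz box down to 30Hz.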
I could have used the 835016, but I didn't want to rip Aristoteles apart, and the cost of those drivers does not match the rest of Demokrit. Another higher cost alternative could be the Wavecor SW270WA01. It is from China with Danish roots, though.

Update Mar-29-2012

Equalization

On the intro page of Demokrit I stated that the speaker would sound too bright if the frequency response were equalized to a horizontally flat transfer function, and I have verified that this is the case.
The following picture shows the initial free field response that allows the speaker to sound natural and pleasing in a typical small and relatively live sounding room. I have used this equalization since the beginning. It needs to be mentioned, though, that I started the equalization process with no scientific target function, and it was my original intention to implement a monotonically falling response.

The reason why this speaker would not sound correct if equalized to flat comes down mainly to two things:

  1. How we perceive sound in a reverberant environment and
     
  2. the radiation pattern of the speakers or more precisely, sound power response.

The sound power response curve is a good approximation of the total sound released into the room, which arrives at the listening position as direct sound and as reflections.
The curve is computed by averaging and weighting all frequency responses of the speaker recorded at all angles (in 10° steps) in the horizontal and vertical plane. You could also look at it as the total acoustic energy radiated through the surface of a sphere whose radius is the microphone distance.
The real unit for sound power is the watt, but in speaker design it is more often expressed as a level, and the curve actually looks like a FR curve [1].
For a point source the power response would be identical to the FR, as the point source radiates equally in every direction. With a real speaker that is not the case, and typically the sound power starts to fall at the latest when the tweeter begins to beam, which creates the typical forward bias.
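
Where beaming sets in can be estimated with the common rule of thumb ka = 1, i.e. the frequency at which the wavelength equals the circumference of the radiating surface. The sketch below assumes a flat circular piston, takes the nominal diameter as the radiating diameter (real drivers radiate from a somewhat smaller area), and uses 343 m/s for the speed of sound. The diameters are generic examples, not measurements of the drivers used here.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C

def beaming_onset_hz(diameter_inch):
    """Frequency where ka = 1 for a piston with the given nominal diameter."""
    radius_m = diameter_inch * 0.0254 / 2
    return SPEED_OF_SOUND_M_S / (2 * math.pi * radius_m)

# Typical dome tweeters and full range cones (nominal sizes in inches).
for diameter in (0.75, 1.0, 4.0, 8.0):
    print(f'{diameter:5.2f}" driver: beaming starts around '
          f'{beaming_onset_hz(diameter):5.0f} Hz')
```

For a 1" dome this lands a little above 4KHz, so on a conventional speaker the sound power response has usually started to fall by then.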

Demokrit is an acoustically small omnidirectional speaker and thus the sound power response closely follows the on-axis FR. As a result this speaker creates a maximum amount of reflected sound at the listening position.
Arta is able to do the averaging part and the resulting picture is shown below. Only responses from 0°…180° horizontally and vertically have been included. So they represent one vertical half of the imaginary sphere. But the missing half is identical and thus the resolution would not have been increased further by adding more measurements. In addition, the curve was calculated with the speaker equalized to flat. But it should be easy for the reader to mentally re-apply the used equalization as shown above as the response of Demokrit is visually clearly dominated by the “on-axis” curve.
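
The averaging step can be sketched in a few lines. This is not Arta's algorithm, whose internal weighting is not described here. The sketch assumes a rotationally symmetric source measured on one arc from 0° to 180°, weights each curve by the area of the spherical zone it represents, and averages the squared pressures rather than the dB values. All names and the example levels are hypothetical.

```python
import math

def zone_weights(angles_deg):
    """Normalized spherical-zone area weights for equally spaced polar angles."""
    step = math.radians(angles_deg[1] - angles_deg[0])
    weights = []
    for angle in angles_deg:
        theta = math.radians(angle)
        lower = max(theta - step / 2, 0.0)
        upper = min(theta + step / 2, math.pi)
        weights.append(math.cos(lower) - math.cos(upper))
    total = sum(weights)
    return [w / total for w in weights]

def sound_power_db(levels_db, angles_deg):
    """Power average of dB curves; levels_db[i][k] is angle i, frequency bin k."""
    weights = zone_weights(angles_deg)
    curve = []
    for k in range(len(levels_db[0])):
        mean_square = sum(w * 10 ** (row[k] / 10)
                          for w, row in zip(weights, levels_db))
        curve.append(10 * math.log10(mean_square))
    return curve

angles = list(range(0, 181, 10))  # 10 degree steps as in the text

# Hypothetical ideal omni: identical level at every angle.
omni = [[80.0, 80.0] for _ in angles]
omni_power = sound_power_db(omni, angles)

# Hypothetical beaming source: second bin loses 0.1 dB per degree off axis.
beaming = [[80.0, 80.0 - 0.1 * a] for a in angles]
beaming_power = sound_power_db(beaming, angles)

print("omni:", [round(v, 2) for v in omni_power])
print("beaming:", [round(v, 2) for v in beaming_power])
```

For the omni case the sound power curve coincides with the on-axis curve, which is the situation described for Demokrit. For the beaming case it drops below the on-axis level.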

 

The difference between the two responses above (FR and sound power) is called the directivity index (DI). Actually, this is the old definition. The new one is the difference between the listening window and the power response. But in the case of Demokrit this does not make any difference, as can be seen from the listening window curve.
In theory the DI (right hand picture) is a flat line at 0dB.
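
Both DI definitions mentioned above reduce to a subtraction of level curves. The sketch below uses made-up levels at three frequency bins, purely for illustration, and takes the listening window as a plain energy average of a few curves around the axis. The exact set of angles in a listening window is an assumption and not taken from the text.

```python
import math

def energy_average_db(curves_db):
    """Energy average of several dB curves, bin by bin."""
    count = len(curves_db)
    return [10 * math.log10(sum(10 ** (curve[k] / 10) for curve in curves_db) / count)
            for k in range(len(curves_db[0]))]

def directivity_index_db(reference_db, sound_power_db):
    """DI = reference curve minus sound power curve, both in dB."""
    return [ref - power for ref, power in zip(reference_db, sound_power_db)]

# Hypothetical ideal omni: every curve is identical.
omni_on_axis = [80.0, 80.0, 80.0]
omni_window = energy_average_db([omni_on_axis, omni_on_axis, omni_on_axis])
omni_power = [80.0, 80.0, 80.0]
omni_di_old = directivity_index_db(omni_on_axis, omni_power)  # on-axis based
omni_di_new = directivity_index_db(omni_window, omni_power)   # window based

# Hypothetical forward-firing box whose sound power falls towards the highs.
box_on_axis = [80.0, 80.0, 80.0]
box_power = [78.0, 75.0, 70.0]
box_di = directivity_index_db(box_on_axis, box_power)

print("omni DI (old):", omni_di_old)
print("omni DI (new):", [round(v, 6) for v in omni_di_new])
print("box DI:", box_di)
```

For the omni both definitions give the flat 0dB line, while the box shows a DI that rises with frequency.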

Perception:
The following picture describes our perception of pure tones in a diffuse field and compares it with how loud we would perceive the same signals in a free field (without any reflected sound). A room has a more or less diffuse sound field above the Schroeder frequency, or the "transition zone" as Toole calls it, since this is no sharp changeover.

As can be seen, the reflected sound has substantial influence upon the perceived loudness of pure tones in the region of 1KHz and between 4KHz and 14KHz. Around 2.5KHz, you would actually need to turn the signal up in a room compared to the same tones played back e.g. in the garden, which is a curiosity in itself. You could also say that in this frequency band we are more sensitive to the direct sound from the speaker than we are to the reflected portion.
So, if you place a speaker that exhibits a wider than typical dispersion, and thus has a relatively flat sound power response, into a room, the resulting sound will be too bright and somehow incorrect in the mid range. Demokrit, when eq’ed to flat, drives this effect to the maximum. In order to re-balance the sound back to “normal”, this rather drastic roll-off in the mids and highs is required.
If we now compare the speaker’s transfer function, which was found empirically, with the picture about loudness perception, a curious similarity becomes obvious.

This is how I have been listening to the speaker since the beginning and I was quite satisfied. Should that be pure coincidence ?
The red region is inherent to the design. I did not do anything with it and until I started writing all this up, it did not even occur to me. The same is true for the border between green and yellow. The frequency range that I manipulated is between 1KHz and 3KHz in order to remove the cone diffraction and between 5KHz and 10KHz in order to bring the output a bit up again so that the response would roughly follow a monotonically falling function.

Having found this basic agreement, I started playing with the frequency band around 2.5KHz. In order to increase the output there, a simple peak (inverse notch) was introduced with a mid frequency of 2.5KHz and a Q of 3.5. Even starting at a level of +0.5dB, the following effects crept up:

  • There was more freshness and clearness.
     
  • Switching back and forth between the increase and the previous EQ, I noticed that the original sound was actually a small pinch too dull. It was not wrong at all, but now the speaker really sounds more shiny (no, I am not saying the veil sentence :-) ). And I actually have to take back what I said about the Vifa and detail finesse. It has more brilliance now.
     
  • Also very interesting is that when the peak is on, the center image is positively more focused and better defined if that frequency range is excited by the program material (with female voices, applause on live recordings, snares, strings, harps, hi-hats and such).
    The increased lateral output is not noticeably affected at all by the peak. No image broadening no nothing. Everything only happens between the speakers !
     
  • The denser the spectrum of the music (e.g. more different instruments + voices + reverb), the less obvious is the change between peak on and off.
     
  • The effects are proportional to the volume setting, so the impressions do not change no matter whether you listen loud or soft.

So the general impressions with an increased radiation in the frequency range between 2KHz and 4KHz can be summarized as more brilliance and a better focus of the center image. The latter must be due to the change in the direct sound portion.

After some playing time I changed the mid frequency to 2.8KHz in order to better match the relative energy distribution of the loudness perception curve. And the result sounded even more convincing, especially with pink noise.
+1.5dB seems to be the minimum required. Then I got daring and cranked the level up to +10dB and more. That certainly is too much. Depending on the program material, the optimum lies somewhere between +2dB and +4dB, and it only becomes annoying at +6dB and greater. And that despite the fact that the direct and the reflected sound are altered at the same time and not only the reflections, which would be ideal for this experiment !
The following diagram shows the peak with +3dB that I have arrived at after longer listening tests.
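
Such a peak can be reproduced as a single peaking biquad. The sketch below uses the widely used Audio EQ Cookbook formulas by Robert Bristow-Johnson, not the MiniDSP's own implementation, which may differ in detail. It assumes a sample rate of 48KHz and that the Q of 3.5 was kept when the mid frequency moved to 2.8KHz. Neither is stated in the text.

```python
import cmath
import math

def peaking_biquad(f0_hz, q, gain_db, fs_hz):
    """Peaking EQ coefficients (Audio EQ Cookbook), normalized so that a[0] = 1."""
    amplitude = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amplitude, -2 * math.cos(w0), 1 - alpha * amplitude]
    a = [1 + alpha / amplitude, -2 * math.cos(w0), 1 - alpha / amplitude]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def magnitude_db(b, a, f_hz, fs_hz):
    """Magnitude response of the biquad at f_hz in dB."""
    z_inv = cmath.exp(-1j * 2 * math.pi * f_hz / fs_hz)
    response = ((b[0] + b[1] * z_inv + b[2] * z_inv**2)
                / (a[0] + a[1] * z_inv + a[2] * z_inv**2))
    return 20 * math.log10(abs(response))

FS = 48000.0  # assumed sample rate
b, a = peaking_biquad(2800.0, 3.5, 3.0, FS)

gain_at_peak_db = magnitude_db(b, a, 2800.0, FS)  # equals the requested +3 dB
gain_at_1k_db = magnitude_db(b, a, 1000.0, FS)    # nearly untouched

for f in (1000.0, 2000.0, 2800.0, 4000.0, 8000.0):
    print(f"{f:6.0f} Hz: {magnitude_db(b, a, f, FS):+.2f} dB")
```

With a Q of 3.5 the boost is confined to a fairly narrow band around the mid frequency, which fits the observation that only the 2KHz to 4KHz range is audibly affected.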

Compared to the change in the 3KHz region, the effect of flattening out the red 1KHz sag is pretty subtle. For that I used a +2dB peak @1KHz with a Q of 2.5. It doesn’t seem to do anything really with music and even with pink noise it is hardly detectable. Therefore, I cannot really describe how the tonality changes. But if the sag is removed it is as if the extra energy eats up dynamic range of the ear and as a result you don’t want to make the music as loud as when it is in place.

There is a story associated with the 13KHz peak. Actually, I wanted to remove it initially. But the transition from green to yellow and this peak itself are influenced by the cone position, and sometimes during measurements the peak is not captured properly. I am not sure why that is, but it might be related to wind noise. So instead of removing it, I only attenuated it by 2.3dB. When I completely remove the peak now (no yellow area) based on the current measurement, the sound is clearly too dull. The right hand picture shows the now unattenuated peak. The 1/3 octave smoothing actually shaves a bit off it. In reality, it is higher.

During the course of these tests I additionally removed the down shelf (100Hz…20KHz, -1.5dB), which is best visible in the very first diagram. This brings the peak up by another +1dB or so. But after that I had to re-balance the upper midrange by bringing the green area a bit further down between 6KHz and 10KHz, and I finally think it sounds better. These last changes are not available as a picture right now.

The bottom line is: all the positive sounding elements found during this re-equalization process make the transfer function of Demokrit more similar to the loudness perception diagram and it was verified that the changes were indeed longer term improvements.
Certainly, the last word has not been spoken about the final equalization of this particular speaker, especially not the transition between the green and yellow areas and the 13KHz peak itself.
But the loudness function is much more applicable to real life speaker design than I initially had thought !

 

So, all this seems to allow for a set of conclusions:

  1. The wider a speaker disperses the sound (the flatter the sound power response or the more reflections it generates), the closer its “on-axis” FR has to follow this or a similar curve. Otherwise the tonality of the speaker does not sound correct when played in a typical domestic room.
    This even seems to hold true for music, although the loudness function was recorded using narrow band noise (not steady state sinusoidal signals as one could think from the picture description and as stated previously).
    Also, dipolar speakers like Aristoteles cannot have a flat FR as they can be considered wide dispersion designs due to the reflections they create. The DSS filter introduced by SL for Orion 3 already provided very similar clues [3] and it does have universal aspects as mentioned in the reference.
     
  2. There cannot be any such general requirement like a ruler flat on-axis frequency response from 20Hz...20KHz that is applicable to every speaker.
    It all depends on the radiation pattern and hence the total energy at each frequency released into and reflected within the room (at least above the Schroeder frequency/transition zone).
    There is, of course, also a dependency on the size of the listening room, absorption coefficients of surfaces and structures, and the listening distance to the speakers.
    The greater the directionality of a speaker, the greater can be, and I would even say should be, the listening distance (which implies a bigger room as well), or else the less we hear from the room. An omni speaker always creates a smaller ratio of direct to reverberant sound. As a result you have to sit closer to it, otherwise you hear too much of the room/the reflected sound.
    Lipshitz and Vanderkooy [2] were concerned in 1985 that “rolled off highs” under “semi-anechoic conditions” would lead to a “falsified direct sound” and the speaker would be “lacking highs”.
    I am not sure about semi-anechoic conditions :-) but in my living room there is no such issue. I think the listener simply has to sit within the “design vicinity” of the speaker. And that is certainly strongly related to the speaker’s reverberation distance. Based on his experience, SL reports that the listening distance should not be greater than two times the reverberation distance of any design, and from my own experience that sounds like a very reasonable value.
    I am sitting 1.8m away from Demokrit, which must have a reverberation distance of < 1m in my room and I can hear that the listening distance is at its limit. During critical listening I find myself sometimes subconsciously leaning forward by about 50cm.
     
  3. If the design of a speaker must, for whatever reason, exhibit a jump in directivity because e.g. you cross to the next smaller driver while the bigger one has already started to beam, try to put it into the range between 2KHz and 4KHz. Based on these findings, the accompanying excess of radiated off-axis energy seems to be no or at least less of an issue there.
     
  4. A regular consumer speaker may very well exhibit a rather flat on axis response because normally dome tweeters (0.75" and 1") start to beam in the 4KHz...5KHz range and hence no extra attenuation is required for a balanced sound indoors as the sound power decreases and the directivity index rises. This can also happen in the mid range.
     
  5. It is even more likely that narrow dispersion constant directivity speakers generally benefit from a flat on-axis response because they produce only few reflections and their sound power response is far from flat (probably the 2nd worst in this respect I can imagine).
    I have seen many designs with typical full range drivers (4"...8") where the lack of off-axis radiation in the highs, due to the early beaming of such units, was compensated with even a rising on-axis FR. That is, with all due respect, the worst sound power response I can imagine. I have a hard time believing that they sound anywhere near balanced when set up free standing in a room of regular size. The highs must be too heavy (and I’m intentionally not saying too bright). Also, such speakers must fail miserably at the desired hiding trick because the extra direct energy draws too much attention to the physical sources. So the speakers can be localized too easily and the brain cannot concentrate on the phantom image. This is actually even true for typical box speakers with regular dome tweeters, which normally already have a very hard time hiding.
    It can be said that sound power in general is not only about tonality/timbre as often assumed.
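
The room-related quantities from point 2 can be estimated with the usual statistical room acoustics formulas. The sketch below is rough: the room height (2.5m) and the reverberation time (0.4s) are placeholders and not measured values of this room, and the open dining area is ignored. Only the floor dimensions and the 1.8m listening distance come from the text.

```python
import math

def reverberation_distance_m(volume_m3, rt60_s, directivity_factor=1.0):
    """Distance at which direct and reverberant levels are equal.
    directivity_factor = 1 corresponds to an omnidirectional source."""
    return 0.057 * math.sqrt(directivity_factor * volume_m3 / rt60_s)

def schroeder_frequency_hz(volume_m3, rt60_s):
    """Approximate lower limit of the statistically diffuse sound field."""
    return 2000.0 * math.sqrt(rt60_s / volume_m3)

def direct_to_reverberant_db(listening_distance_m, reverberation_dist_m):
    """Direct sound falls with 1/r^2, the reverberant level is taken as constant."""
    return 20 * math.log10(reverberation_dist_m / listening_distance_m)

ROOM_LENGTH_M, ROOM_WIDTH_M = 4.7, 3.9  # from the text
ROOM_HEIGHT_M = 2.5                     # assumption
RT60_S = 0.4                            # assumption

volume = ROOM_LENGTH_M * ROOM_WIDTH_M * ROOM_HEIGHT_M
r_rev = reverberation_distance_m(volume, RT60_S)
f_schroeder = schroeder_frequency_hz(volume, RT60_S)

# Effect of leaning forward by 50cm from the 1.8m listening position.
ratio_seated_db = direct_to_reverberant_db(1.8, r_rev)
ratio_leaning_db = direct_to_reverberant_db(1.3, r_rev)
gain_from_leaning_db = ratio_leaning_db - ratio_seated_db

print(f"reverberation distance: {r_rev:.2f} m")
print(f"Schroeder frequency: {f_schroeder:.0f} Hz")
print(f"direct/reverberant at 1.8m: {ratio_seated_db:.1f} dB")
print(f"gain from leaning forward: {gain_from_leaning_db:.1f} dB")
```

The gain from leaning forward does not depend on the assumed room data. It is simply 20·log10(1.8/1.3), a little under 3 dB more direct sound relative to the room.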

Summary:
How I came to the findings felt a bit like Alice in speakerland. Nonetheless, the Demokrit experiment shows a very interesting strong agreement with the cited diagram in a typical small listening environment. The room’s reflection/energy decay properties have neither been further quantified nor taken into account. My room is not special in any way. Even so, the correlation is there and as a result, the loudness function is demonstrably a meaningful target function for equalizing omni polar and even other wide dispersion speakers. However, the narrower the dispersion or the less the room is excited, the more it needs proper interpretation of and extrapolation from the exact loudness response.
My satisfaction with the initial equalization also shows once more the might of this freaking adaptation phenomenon, because I was already simply used to the sound. It has to be mentioned, though, that the new equalization has in no way changed the soul of the speaker. The strongest long term change is the improved focus and brilliance in the center image. But now I can draw even more joy from the speaker, although I plan to further optimize the transfer function. I found that this process is iterative, as the changes in the blue, green and yellow sections interact with each other perceptually, whereas the red region has been found to be an isolated phenomenon so far. After re-doing the yellow and green spots, the blue one is back to a mid frequency of 2.5KHz and even a lower Q of 3 (more blue, less green, more direct energy in total) and it does not seem to make any difference anymore, which is also very interesting.

P.S. Speaker equalization of such magnitude should not solely be done based on 3rd octave smoothed measurements. Higher resolution is required. I only use the smoother graphs here because they are easier to interpret.

References:
[1] Floyd E. Toole, Sound Reproduction, The Acoustics and Psychoacoustics Of Loudspeakers And Rooms, first edition, 2008, page 379

[2] Stanley P. Lipshitz, John Vanderkooy, Experiments In Direct / Reverberant Ratio Modification, presented at the 79th Convention in New York, October 12-16, 1985, page 3 top


[3] A2 - High-frequency down-shelving for ORION-3

 

Addendum 06-Apr-2012:
The 2.5KHz peak has meanwhile become instrumental to this speaker. It creates a much better balance and focus in the phantom image between the speakers. It acts as if the directivity of the speaker has increased a bit and it might be a feature that is missing in many other "full omni" designs.

I would also like to emphasize that the equalization method based on this loudness function has primarily nothing to do with the "loudness switch" that can be found in typical consumer amplifiers and that is even adjustable in Yamaha gear. That "loudness switch" is more related to the equal loudness contours (Fletcher–Munson curves), in that it balances the perceived sound when we listen at loud vs. low volume. So this switch is solely meant to increase the bass level while the volume is turned down, and sometimes it increases the highs as well, which is actually not required and hence incorrect.

Last updated 16-Apr-2012