----- Discrete versus continuous spectrum
The energy (frequency) of an emitted or absorbed photon matches the energy change of the emitting particle, often an electron. In the small molecules that compose a gas, the possible energy levels are well separated and define a set of possible photon energies. But in a solid metal, for instance, electrons have many possible energy levels very close to one another, because many electrons share one big common space, so the photon spectrum is nearly continuous.
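To put rough numbers on "well separated" versus "very close", here is a minimal sketch using the textbook particle-in-a-box model. The model and the two box sizes (0.3 nm for a small molecule, 1 cm for a piece of metal) are my assumptions for illustration; real molecules and metals are more complicated.

```python
# Idealised 1-D particle-in-a-box levels: E_n = n^2 h^2 / (8 m L^2).
# The model and both box lengths are illustrative assumptions.
H = 6.62607015e-34       # Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electronvolt

def box_level(n, length_m):
    """Energy in eV of level n for an electron in a box of the given length."""
    return n**2 * H**2 / (8 * M_E * length_m**2) / EV

def spacing(n, length_m):
    """Energy gap in eV between levels n and n+1."""
    return box_level(n + 1, length_m) - box_level(n, length_m)

molecule = spacing(1, 0.3e-9)   # electron confined to about 0.3 nm
metal = spacing(1, 1e-2)        # electron free to roam over 1 cm

print(f"0.3 nm box: {molecule:.2f} eV between the two lowest levels")
print(f"1 cm box:   {metal:.2e} eV between the two lowest levels")
```

The gap scales as 1/L^2, so making the "common big place" larger squeezes the levels together by many orders of magnitude.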
Many solids can emit and absorb light at discrete frequencies too, for instance YAG crystals used in lasers. YAG contains "colour centres" made of dopant atoms or ions where one electron stays localised. This electron has well-spaced energy levels because it is localised, and its spectrum is discrete. A well-chosen solid of this kind would emit light at discrete lines when hot, provided its bulk is transparent enough that its continuous emission doesn't hide the lines.
Also, it takes time to define a frequency or energy. The duration of the unhindered absorption or emission by the particle is limited, for instance by collisions that affect the electron's state. As a result, transition lines get wider as a gas's pressure increases and collisions become more frequent. For instance, 1 atm suffices to blur completely the 2.45 GHz line of water vapour (made worse by the frequency being lower than visible light), and in liquid water it's not observable, so "water resonance" in microwave ovens is irrelevant. The Doppler effect blurs the lines too: each emitter's speed relative to the observer shifts its frequency, and the contributions of many emitters superimpose.
So if 10^23 lines weren't continuous enough, we can also say that when the blurring is wider than the line spacing, the spectrum is continuous.
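The two blurring mechanisms and the "wider than the spacing" criterion can be sketched with the standard textbook width formulas. The example values (a sodium line near 5.09e14 Hz at 500 K, and 1e-10 s between collisions) are assumptions chosen only to show orders of magnitude.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
C = 299792458.0           # speed of light, m/s
AMU = 1.66053906660e-27   # atomic mass unit, kg

def doppler_fwhm(f0_hz, temp_k, mass_amu):
    """Full width at half maximum of a thermally Doppler-broadened line."""
    return f0_hz * math.sqrt(
        8 * K_B * temp_k * math.log(2) / (mass_amu * AMU * C**2)
    )

def collision_fwhm(time_between_collisions_s):
    """Lorentzian width when collisions interrupt the emitter every tau seconds."""
    return 1.0 / (math.pi * time_between_collisions_s)

def looks_continuous(line_width_hz, line_spacing_hz):
    """Criterion from the text: blurring wider than the line spacing."""
    return line_width_hz > line_spacing_hz

# Assumed example: sodium atoms (23 amu) at 500 K, line near 5.09e14 Hz.
doppler = doppler_fwhm(5.09e14, 500.0, 23.0)
# Assumed example: 1e-10 s between collisions.
collisional = collision_fwhm(1e-10)

print(f"Doppler width:     {doppler:.2e} Hz")
print(f"Collisional width: {collisional:.2e} Hz")
print(looks_continuous(collisional, 1e9))
```

Shorter times between collisions (higher pressure) give wider lines, which is the "it takes time to define a frequency" argument in numbers.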
As a first note, a magnetic field doesn't really attract a pole. It's the gradient of the magnetic field that attracts or repels a dipole, depending on their relative orientations. "Attract a pole" suffices in everyday life because magnet ends create field gradients. And in Stern-Gerlach, the magnets are designed to create field gradients; a uniform field would have no effect on the atom's path (in the region where the field is uniform, that is: it must end somewhere).
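As a rough illustration of the gradient argument, the force on a dipole aligned with the z axis is F = mu * dB/dz, which vanishes in a uniform field. The sketch below uses assumed round numbers (1000 T/m gradient, 3.5 cm magnet, silver atoms at 500 m/s, one Bohr magneton of moment); none of these values come from the post.

```python
MU_B = 9.2740100783e-24   # Bohr magneton, J/T
AMU = 1.66053906660e-27   # atomic mass unit, kg

def dipole_force(moment_j_per_t, gradient_t_per_m):
    """Force along z on a dipole aligned with z: F = mu * dB/dz."""
    return moment_j_per_t * gradient_t_per_m

def deflection(force_n, mass_kg, magnet_length_m, speed_m_s):
    """Sideways shift at the magnet exit under a constant force."""
    flight_time = magnet_length_m / speed_m_s
    return 0.5 * (force_n / mass_kg) * flight_time**2

# All four values below are assumptions for illustration.
gradient = 1000.0                 # T/m
silver_mass = 107.8682 * AMU      # kg
magnet_length = 0.035             # m
speed = 500.0                     # m/s

for sign, label in ((+1, "parallel"), (-1, "antiparallel")):
    force = dipole_force(sign * MU_B, gradient)
    shift = deflection(force, silver_mass, magnet_length, speed)
    print(f"{label:>12}: force {force:+.2e} N, shift {shift * 1e3:+.3f} mm")

print("uniform field:", dipole_force(MU_B, 0.0), "N")
```

The two signs give two opposite shifts of a fraction of a millimetre, and setting the gradient to zero removes the force entirely.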
Depending on whether the electron's magnetic moment is parallel or antiparallel to the field, the atom is attracted or repelled (put signs if you like) by the sharper pole shoe, where the field is stronger. You might have expected the electron to spontaneously take its orientation of lower energy, as we are used to with compasses, but this requires releasing energy; since spin flipping isn't very efficient at radiating energy, it takes time, longer than the atom's flight in Stern-Gerlach.
By the way, this time is longer for nuclei than for electrons because they're heavier. MRI devices observe, measure and map this time, with more refinements.
The revolutionary part of Stern-Gerlach is that an electron (and the atom that contains it) can take only two "orientations" in the magnetic gradient: parallel or antiparallel. I don't know the reason, sorry. It relates to the electron's angular momentum, and to its being a fermion.
If you observe many particles with parallel or antiparallel orientation, for instance in an MRI device, the statistical result (the imbalance is only about 1 ppm away from 50/50) is a vector magnetic field. This field behaves as if individual electrons or nuclei were small charged gyroscopes (but mind the Landé factor), subject to a torque from the external field, with a gyroscopic stiffness, a precession angle and frequency, and so on.
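To check the ppm-scale imbalance and the precession frequency, here is a minimal sketch for protons. It assumes a 1.5 T field and body temperature (310 K), which are my example values, and uses the standard two-level thermal result tanh(h f / 2kT).

```python
import math

H = 6.62607015e-34                       # Planck constant, J*s
K_B = 1.380649e-23                       # Boltzmann constant, J/K
GAMMA_PROTON_HZ_PER_T = 42.577478518e6   # proton gyromagnetic ratio / (2 pi)

def larmor_frequency(field_t):
    """Precession frequency in Hz of proton spins in the given field."""
    return GAMMA_PROTON_HZ_PER_T * field_t

def polarisation(field_t, temp_k):
    """Excess fraction of spins in the lower-energy orientation (two levels)."""
    return math.tanh(H * larmor_frequency(field_t) / (2 * K_B * temp_k))

field = 1.5      # tesla, assumed example value
temp = 310.0     # kelvin, assumed body temperature

print(f"Larmor frequency: {larmor_frequency(field) / 1e6:.2f} MHz")
print(f"Imbalance: {polarisation(field, temp) * 1e6:.1f} ppm")
```

The imbalance comes out at a few ppm, the same order as the figure quoted above, and it grows in proportion to the field.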
And you could even, if you wanted to define an electron's orientation by a definite spin vector, compute the precession of this vector, and so on, and deduce at the end the chances of observing the electron parallel or antiparallel.
- BUT -
This vector would be a "hidden variable": something that would exist but is not observable, or not observed with our current technology. However, we know from experiments (not from theory, and this was debated long ago) that hidden variables, at least local ones, don't exist, especially not for the spin.
So the proper formalism uses no such spin vector. Instead, it takes as the spin state the chances of observing the orientation parallel vs antiparallel along the three axes, and gives matrices to compute how these chances evolve as an external field acts.
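A minimal sketch of that bookkeeping for one spin-1/2 particle: the state is a pair of complex amplitudes in the z basis, a field along z multiplies them by opposite phases, and the chances of finding the spin parallel to x, y or z follow from the amplitudes. The choice of field axis and the sign of the rotation are conventions I assume here; the function names are mine.

```python
import cmath
import math

def evolve_in_z_field(state, angle_rad):
    """Apply the 2x2 diagonal matrix for a precession angle about z."""
    up, down = state
    return (up * cmath.exp(-0.5j * angle_rad),
            down * cmath.exp(+0.5j * angle_rad))

def parallel_probabilities(state):
    """Chances of finding the spin parallel to the x, y and z axes."""
    up, down = state
    coherence = up.conjugate() * down
    sx = 2 * coherence.real               # expectation value of sigma_x
    sy = 2 * coherence.imag               # expectation value of sigma_y
    sz = abs(up) ** 2 - abs(down) ** 2    # expectation value of sigma_z
    return tuple((1 + s) / 2 for s in (sx, sy, sz))

# Start with a spin certain to be found parallel to x.
start = (complex(1 / math.sqrt(2)), complex(1 / math.sqrt(2)))

for angle in (0.0, math.pi / 2, math.pi):
    px, py, pz = parallel_probabilities(evolve_in_z_field(start, angle))
    print(f"angle {angle:.2f} rad: P(x)={px:.2f} P(y)={py:.2f} P(z)={pz:.2f}")
```

The chance along z stays at one half while the chances along x and y trade places as the angle grows, which is the probabilistic counterpart of the precessing vector.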
Hum... Expect many people to mentally use the wrong but comfortable picture of a spin vector for individual particles, and to convert to probabilities only at the last moment and in public.