It might be more accurate to say that what diction is fashionable changes rather than a single fashionable diction evolving. There were two major shifts happening almost side by side in the late 19th century that led to this. The formalization of phonetics as a sub-branch of linguistics and the technological development of what would become mass media, initially radio, and later all the other forms of media.
Linguists/philologists like Paul Passy in France and Henry Sweet in England spent the 1880s and 1890s developing tools like the International Phonetic Alphabet and providing early training in phonetics and providing descriptive accounts of languages. Sweet published A Primer of Spoken English, which offered the first full description of pronunciation in a dialect of English, based on the educated speech of London. Incidentally, Sweet was also one of the main inspirations for the character of Henry Higgins in George Bernard Shaw's Pygmalion and the musical My Fair Lady. However, it is clear from his work that while Sweet recognized the prejudices that existed towards different dialects (especially the London Cockney dialect), his intent was not to produce a prescriptive approach towards English pronunciation. Nonetheless, the kind of educated, London dialect that Sweet described would end up spawning two separate forms of acquired diction, as well as a cadre of diction instructors newly trained in the discipline of phonetics.
In the US, this developed as an affected diction that you had to receive explicit instruction in that was a mixture of both American features and especially desirable features of educated London English. This dialect is typically called Mid-Atlantic or sometimes Transatlantic, and it's the dialect that you are asking about above. One of the best examples of it is in the movie The Philadelphia Story, in which both Katharine Hepburn and Cary Grant speak in a Mid-Atlantic dialect that sounds utterly unlike any dialect that exists today. Hepburn was from a well-off American family and received an education from elite schools like Bryn Mawr, while Grant grew up as a destitute vaudevillian from the outskirts of Bristol, performing in London and then touring the US. Hepburn received explicit instruction from a vocal coach while establishing herself as an actress, while Grant's accent developed a bit more naturally as a result of his own life experiences performing and touring in London and then the US at a very young age, likely reinforced by the desirability of the mid-Atlantic accent in the US. The accent itself was developed by an Australian vocal coach and phonetician, William Tilly, at Columbia University in the 1910s and 20s, although he would initially call it World English. Many upper class east coasters would receive explicit and extended vocal instruction as part of their education throughout much of the early to mid 20th century, and its influence would spread due to the prominence of speakers throughout journalism, theatre, films, and politics, aided and abetted by mass media.
In the UK, meanwhile, the kind of standard set by educated London English was developing into a more normative standard often called Received Pronunciation, often abbreviated as RP, and sometimes called BBC English for its association with the BBC early in its history. Its association with educated London English meant that it at least had a geographic location, unlike the more placeless Mid-Atlantic dialect, but it was also often an affected form of English that was taught by vocal coaches, with the expectation for instance that actors would perform in RP. A lot of the stereotypes that people have about how classic British performances of Shakespeare should sound are informed by RP, and in fact when Shakespeare is performed in Original Pronunciation (OP), a subset of audience members can actually be upset because OP doesn't sound as fancy.
Incidentally, you may be wondering about the features of these affected dialects. The single most important feature of both the Mid-Atlantic accent and Received Pronunciation is that they are non-rhotic (rhotic comes from the Greek letter rho). This means that they have a tendency to drop r sounds except for in places where a vowel comes immediately after. It's a common feature in many English dialects across class lines, but for these dialects in particular it gets coded as high status.
So what changed? In the US, non-rhotic dialects were largely limited to the East Coast, and the educational institutions that pushed the Mid-Atlantic dialect were also largely limited to the elite schools of the East Coast. In the wake of World War II, and with a newly awakened sense of the US as a global power in its own right, the center of American English was shifting inland to rhotic dialects. At the same time, the growing population of the West and California's role as the center of the film industry provided a new kind of standard that wasn't rooted in East Coast dialect norms, and in this period we see the development of what is often called General American English or Standard American English, although both terms can be problematic. This is what would be referenced in the film Anchorman when Veronica Corningstone says, "While they're laughing and grab-assing, I'm chasing down leads and practicing my non-regional diction." The idea of General or Standard is that it is a type of American English dialect that isn't readily locatable and that is broadly shared by a number of people across different regions, although in actuality when you get into the nitty gritty of dialectology, this can get much messier.
But even as the fashion has changed, it doesn't mean that there aren't still plenty of dialect coaches out there working with people to acquire a dialect that is perceived as higher status compared to what they speak. The relative status of dialects is quite old. In the medieval play The Second Shepherds' Play, there is a scene in which a character adopts a southern English accent as part of a not especially good disguise in order to convey that he has some connection to royalty. But the tools for training people in diction are very different thanks to the development of phonetics and the capacity to disseminate dialect and establish its prestige was larger in the 20th century thanks to the development of mass media.