{"id":955,"date":"2018-10-04T08:39:24","date_gmt":"2018-10-04T06:39:24","guid":{"rendered":"http:\/\/muzeum.netfolio.hu\/?p=955"},"modified":"2025-06-05T13:21:18","modified_gmt":"2025-06-05T11:21:18","slug":"befejezodott-20-evfolyamnyi-arjegyzolap-digitalizalasa","status":"publish","type":"post","link":"https:\/\/tozsdemuzeum.hu\/en\/2018\/10\/04\/befejezodott-20-evfolyamnyi-arjegyzolap-digitalizalasa\/","title":{"rendered":"Digitisation of 20 annual volumes of Price Quotation Journals completed"},"content":{"rendered":"<p>We are pleased to announce that, after almost two years of work, the digitisation of the Budapest Commodity and Stock Exchange Price Quotation Journals, published between 1894 and 1913, was completed in October 2018.<\/p>\n<h2>The beginnings<\/h2>\n<p>The idea to process old stock market data first occurred to M\u00e1rton Radnai, CEO of Ramasoft Data Services and Information Technology Ltd., almost 10 years ago, when he learned that daily US stock market data is now available going back almost 200 years. It was then that he conceived the idea of processing the equivalent data for Hungary. But at the time, neither commercial nor public funding for the project could be found.<\/p>\n<p>However, over the past 10 years, the cost of digitising old journals has fallen significantly, as <a href=\"http:\/\/www.arcanum.hu\/\">Arcanum Database Ltd.<\/a> digitised one of the stock market price sources, the <a href=\"https:\/\/adtplus.arcanum.hu\/hu\/collection\/BudapestiKozlony\/\">Budapest Gazette<\/a>, and made it available at its own expense in the <a href=\"https:\/\/adtplus.arcanum.hu\/hu\/\">Arcanum Digital Library<\/a>. So two years ago, M\u00e1rton Radnai decided to start the project with his own funding.<\/p>\n<h2>Finding sources<\/h2>\n<p>The project started with a search for available data sources. 
Until the First World War, the prices of the old stock exchange were published in three journals: the Budapest Commodity and Stock Exchange Price Quotation Journal, published almost uninterruptedly between 1864 and 1948, the Budapest Gazette (the official state gazette of the time) and Pester Lloyd. Of the three publications, the Budapest Gazette provided only partial data (no transaction prices, only closing prices) and Pester Lloyd was in German. After the First World War, the data were published for a while only in the Quotation Journal, and then again in the Budapest Gazette, but only in a heavily abridged form. It became clear that the optimal solution would be to digitise the Quotation Journal.<\/p>\n<figure id=\"attachment_771\" aria-describedby=\"caption-attachment-771\" style=\"width: 400px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-771\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_171302-e1538211667627-225x300.jpg\" alt=\"\" width=\"400\" height=\"533\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_171302-e1538211667627-225x300.jpg 225w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_171302-e1538211667627-768x1024.jpg 768w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/><figcaption id=\"caption-attachment-771\" class=\"wp-caption-text\">Some volumes of the price lists are still in their original binding.<\/figcaption><\/figure>\n<p>The Quotation Journals were printed in small numbers, so very few copies have survived: only a few Hungarian libraries held them, and some volumes were available in the Austrian National Library. Of these, the Budapest Collection of the <a href=\"http:\/\/www.fszek.hu\/\">Szab\u00f3 Ervin Library of Budapest<\/a> (where the years 1873-1913 were available) agreed to make its collection available for digitisation. The digitisation was carried out by Arcanum Database Ltd. 
Negotiations between the three partners were finalised by April 2017, and digitisation began.<\/p>\n<h2>Photography<\/h2>\n<p>Since the printing quality of the 1873 to 1893 volumes was not yet good enough to allow later computer processing of the data, and since the data content of the price lists for this period was essentially identical to the already digitised Budapest Gazette, the project sponsor Ramasoft decided, for reasons of cost-effectiveness, to digitise only the 1894 to 1913 period.<\/p>\n<figure id=\"attachment_765\" aria-describedby=\"caption-attachment-765\" style=\"width: 500px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-765\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_170330-300x225.jpg\" alt=\"\" width=\"500\" height=\"375\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_170330-300x225.jpg 300w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_170330-768x576.jpg 768w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_170330-1024x768.jpg 1024w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><figcaption id=\"caption-attachment-765\" class=\"wp-caption-text\">The first step was to remove the old binding, which was done with this cutting machine.<\/figcaption><\/figure>\n<p>The images were captured using two techniques: the 1894-1904 volumes were photographed with a camera. 
For the later years, when the price sheet had doubled in size, the images were captured using a so-called map scanner.<\/p>\n<figure id=\"attachment_767\" aria-describedby=\"caption-attachment-767\" style=\"width: 400px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-767\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_171135-e1538211610126-768x1024.jpg\" alt=\"\" width=\"400\" height=\"533\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_171135-e1538211610126-768x1024.jpg 768w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/09\/20170503_171135-e1538211610126-225x300.jpg 225w\" sizes=\"auto, (max-width: 400px) 100vw, 400px\" \/><figcaption id=\"caption-attachment-767\" class=\"wp-caption-text\">The lighting was then installed.<\/figcaption><\/figure>\n<h2>Pre-processing of images<\/h2>\n<p>After capture, Arcanum handed the images over to Ramasoft. To improve the quality of the subsequent optical character recognition, Ramasoft developed special software that performed two functions: firstly, it automatically straightened the price tables and corrected their trapezoidal (keystone) distortion, so that the rows of each table were exactly horizontal, which is a prerequisite for efficient character recognition. 
Secondly, it transformed the tables to the same pixel position on every day's page, so that a template could be set in the optical recognition software for processing them.<\/p>\n<figure id=\"attachment_989\" aria-describedby=\"caption-attachment-989\" style=\"width: 538px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-989 size-full\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Scaneredeti.jpg\" alt=\"\" width=\"538\" height=\"246\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Scaneredeti.jpg 538w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Scaneredeti-300x137.jpg 300w\" sizes=\"auto, (max-width: 538px) 100vw, 538px\" \/><figcaption id=\"caption-attachment-989\" class=\"wp-caption-text\">The original image<\/figcaption><\/figure>\n<p>However, the character recognition result was not yet satisfactory, as the recognition software could not separate the rows of the tables. Therefore, in a second step, the dividing lines of the tables were drawn onto the images. The tables could then be recognised with sufficient quality.<\/p>\n<figure id=\"attachment_991\" aria-describedby=\"caption-attachment-991\" style=\"width: 538px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-991 size-full\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Scankorrig\u00e1lt.jpg\" alt=\"\" width=\"538\" height=\"246\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Scankorrig\u00e1lt.jpg 538w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Scankorrig\u00e1lt-300x137.jpg 300w\" sizes=\"auto, (max-width: 538px) 100vw, 538px\" \/><figcaption id=\"caption-attachment-991\" class=\"wp-caption-text\">The corrected image<\/figcaption><\/figure>\n<h2>Optical character recognition<\/h2>\n<p>For optical character recognition, Ramasoft used the ABBYY FineReader application. 
This was done in two phases: in the first phase, the straightened and trapezoid-corrected images were converted into a so-called two-layer PDF, so that the pages can be read later.<\/p>\n<p>In the second phase, character recognition was run on the version with the drawn table lines, and the data were exported to Microsoft Excel.<\/p>\n<figure id=\"attachment_1003\" aria-describedby=\"caption-attachment-1003\" style=\"width: 606px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-1003 size-full\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Korrig\u00e1ltOCR.jpg\" alt=\"\" width=\"606\" height=\"277\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Korrig\u00e1ltOCR.jpg 606w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Korrig\u00e1ltOCR-300x137.jpg 300w\" sizes=\"auto, (max-width: 606px) 100vw, 606px\" \/><figcaption id=\"caption-attachment-1003\" class=\"wp-caption-text\">Recognising the corrected image in the OCR software<\/figcaption><\/figure>\n<figure id=\"attachment_999\" aria-describedby=\"caption-attachment-999\" style=\"width: 760px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-999 size-full\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Excelnyers.jpg\" alt=\"\" width=\"760\" height=\"272\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Excelnyers.jpg 760w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Excelnyers-300x107.jpg 300w\" sizes=\"auto, (max-width: 760px) 100vw, 760px\" \/><figcaption id=\"caption-attachment-999\" class=\"wp-caption-text\">Raw data exported to Excel<\/figcaption><\/figure>\n<h2>Data verification and correction<\/h2>\n<p>The raw Excel output was then organised into a database. To do this, we built a securities data list and a dictionary that matched the security names printed in the journal with the names in the data list. 
A major complication was that in many cases ditto marks were used in place of a security's name, which made this matching difficult. In addition, the price data had to be cleaned and formatted (for example, the quotations of the time did not have decimal separators).<\/p>\n<p>From the data thus paired and cleaned, another Excel file was created. Automatic validation rules were run on this file, and it was also given to the proofreaders, who compared it with the original documents and corrected the remaining errors manually.<\/p>\n<figure id=\"attachment_997\" aria-describedby=\"caption-attachment-997\" style=\"width: 760px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-997 size-full\" src=\"http:\/\/www.tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Excelfeldolgozott.jpg\" alt=\"\" width=\"760\" height=\"264\" srcset=\"https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Excelfeldolgozott.jpg 760w, https:\/\/tozsdemuzeum.hu\/wp-content\/uploads\/2018\/10\/Excelfeldolgozott-300x104.jpg 300w\" sizes=\"auto, (max-width: 760px) 100vw, 760px\" \/><figcaption id=\"caption-attachment-997\" class=\"wp-caption-text\">Paired and corrected data in Excel<\/figcaption><\/figure>\n<h2>Exporting data to a database<\/h2>\n<p>The last step was to export the corrected data to an SQL database.<\/p>","protected":false},"excerpt":{"rendered":"<p>We are pleased to announce that, after almost two years of work, the digitisation of the Budapest Commodity and Stock Exchange Price Quotation Journals, published between 1894 and 1913, was completed in October 2018. The beginnings The idea to process old stock market data first occurred to M\u00e1rton Radnai, CEO of Ramasoft Data Services and Information Technology Ltd., 
almost 10 years ago, when he learned that past US [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":771,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-955","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-egyeb"],"acf":[],"_links":{"self":[{"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/posts\/955","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/comments?post=955"}],"version-history":[{"count":16,"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/posts\/955\/revisions"}],"predecessor-version":[{"id":12079,"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/posts\/955\/revisions\/12079"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/media\/771"}],"wp:attachment":[{"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/media?parent=955"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/categories?post=955"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tozsdemuzeum.hu\/en\/wp-json\/wp\/v2\/tags?post=955"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}