Filtering redirects and disambiguation pages

Do you know about the redirect table in the database and the 'disambiguation' pp_propname record in the page_props table? EllenCT (talk) 01:17, 6 July 2016 (UTC)

@EllenCT: I have not used them, but I know they exist. That table and that field are present in our MySQL databases, while the pageview data is available in our Hadoop cluster. Keeping the two synchronized on a regular basis is a challenge, which is why we don't correlate that information as of today. --JAllemandou (WMF) (talk) 09:31, 7 July 2016 (UTC)
Are they too big to mirror across? EllenCT (talk) 13:48, 7 July 2016 (UTC)
@EllenCT: Most tables are not too big to import from a Hadoop perspective. The concern is more about MySQL's capacity to handle regular large exports, and the cost of automating this for 800+ databases. Also, milimetric is starting work on a one-off task that involves joining Hadoop and MySQL data (see T139324). --JAllemandou (WMF) (talk) 09:44, 8 July 2016 (UTC)
Nice! EllenCT (talk) 02:47, 9 July 2016 (UTC)
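
For anyone following this thread, here is a minimal sketch of the filtering discussed above: given a list of page titles, drop those that have a row in the redirect table or a 'disambiguation' row in page_props. The table and column names (page, redirect.rd_from, page_props.pp_page, page_props.pp_propname) follow the MediaWiki core schema. Everything else is an assumption made for illustration: the in-memory SQLite database stands in for a MySQL replica, the function names `build_demo_db` and `filter_articles` are hypothetical, and the sample rows are made up. It does not address the synchronization problem with Hadoop raised above.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for a wiki database (illustrative rows only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE redirect (
            rd_from INTEGER PRIMARY KEY,
            rd_namespace INTEGER,
            rd_title TEXT
        );
        CREATE TABLE page_props (
            pp_page INTEGER,
            pp_propname TEXT,
            pp_value TEXT
        );
        INSERT INTO page VALUES
            (1, 0, 'Mercury_(planet)'),
            (2, 0, 'Mercury'),
            (3, 0, 'Planet_Mercury');
        -- Page 3 redirects to the planet article.
        INSERT INTO redirect VALUES (3, 0, 'Mercury_(planet)');
        -- Page 2 is flagged as a disambiguation page.
        INSERT INTO page_props VALUES (2, 'disambiguation', '');
        """
    )
    return conn


def filter_articles(conn, titles):
    """Return the main-namespace titles that are neither redirects nor disambiguation pages."""
    titles = list(titles)
    if not titles:
        return []
    placeholders = ",".join("?" for _ in titles)
    query = f"""
        SELECT p.page_title
        FROM page AS p
        LEFT JOIN redirect AS r
               ON r.rd_from = p.page_id
        LEFT JOIN page_props AS pp
               ON pp.pp_page = p.page_id
              AND pp.pp_propname = 'disambiguation'
        WHERE p.page_namespace = 0
          AND r.rd_from IS NULL      -- no redirect row
          AND pp.pp_page IS NULL     -- no disambiguation property
          AND p.page_title IN ({placeholders})
        ORDER BY p.page_title
    """
    return [row[0] for row in conn.execute(query, titles)]
```

With the demo rows, calling `filter_articles(build_demo_db(), ['Mercury', 'Mercury_(planet)', 'Planet_Mercury'])` keeps only the planet article. The anti-join (LEFT JOIN plus IS NULL) is one way to express this; a NOT EXISTS subquery would work as well.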