SQL Server JOINs executing very slowly with large tables


Below I have a query that takes the e-mail from one table and joins three other tables to match on that e-mail. It then filters on two columns (utm_campaign and utm_source) to make sure they are not empty.

Two of the tables have close to a million rows, and the other two have around 100,000 rows.

Currently, outputting 100 rows takes approximately 60 seconds. I'm expecting between 500,000 and 1,000,000 rows from the SELECT statement, so at this rate it might take 4-5 days to complete.

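I don't understand why the server's processors are only using 27% of their resources, or what I could be doing differently with the joins to make the process faster. I have refined the joins as much as I could and increased the number of processors on the server, to no avail. I'm not familiar with indexing, and I don't know whether it can be done with this kind of data.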

Has anyone had experience doing joins on such large tables who can identify flaws in the logic of my query, or maybe come up with a more efficient way of matching rows from the other tables? Please see the complete query below for reference:

SELECT
    pu.recip_id,
    pu.email,
    pu.date_joined,
    vp.utm_source AS vp_source,
    vp.utm_med AS vp_medium,
    vp.utm_camp AS vp_campaign,
    vp.created AS vp_created,
    sch.utm_source AS sch_source,
    sch.utm_med AS sch_medium,
    sch.utm_camp AS sch_campaign,
    sch.created AS sch_created,
    gf.utm_source AS gf_source,
    gf.utm_medium AS gf_medium,
    gf.utm_campaign AS gf_campaign,
    gf.created AS gf_created
FROM [digital].[dbo].[postup_recipients] pu
LEFT JOIN [digital].[dbo].[vp_charges] vp
    ON pu.email = '"' + vp.email + '"'
LEFT JOIN [digital].[dbo].[stripe_customers] scu
    ON pu.email = '"' + scu.email + '"'
LEFT JOIN [digital].[dbo].[stripe_charges] sch
    ON scu.cust_id = sch.cust_id
LEFT JOIN [digital].[dbo].[gform_entries] gf
    ON pu.email = '"' + gf.email + '"'
WHERE
    (gf.utm_source IS NOT NULL AND gf.utm_source != ''
        AND gf.utm_campaign IS NOT NULL AND gf.utm_campaign != '')
    OR
    (vp.utm_source IS NOT NULL AND vp.utm_source != ''
        AND vp.utm_camp IS NOT NULL AND vp.utm_camp != '')
    OR
    (sch.utm_source IS NOT NULL AND sch.utm_source != ''
        AND sch.utm_camp IS NOT NULL AND sch.utm_camp != '')

Create indexes on vp.email, scu.email, sch.cust_id, and gf.email.
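
As a rough sketch, those four indexes could be created along these lines. The table and column names come from the query in the question; the index names are made up for illustration, and I'm assuming the columns are of a type that can be an index key (for example varchar(255) rather than varchar(max)), which the question doesn't say.

```sql
-- Hypothetical index names; tables and columns are taken from the question's query.
-- Assumes email and cust_id are index-compatible types (not varchar(max) or text).
CREATE NONCLUSTERED INDEX IX_vp_charges_email
    ON [digital].[dbo].[vp_charges] (email);

CREATE NONCLUSTERED INDEX IX_stripe_customers_email
    ON [digital].[dbo].[stripe_customers] (email);

CREATE NONCLUSTERED INDEX IX_stripe_charges_cust_id
    ON [digital].[dbo].[stripe_charges] (cust_id);

CREATE NONCLUSTERED INDEX IX_gform_entries_email
    ON [digital].[dbo].[gform_entries] (email);
```

Building these on tables of this size takes some time and extra disk space, so it may be worth running them outside busy hours.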

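Reverse the join logic on the three joins where you're calculating a value, so that the indexed column is compared as-is and the index on it can actually be used, e.g. pu.email = '"' + vp.email + '"' => vp.email = substring(pu.email, 2, len(pu.email) - 2).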
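
Applied to all three e-mail joins, the rewrite might look like the sketch below (column list shortened for brevity). It assumes every pu.email value is wrapped in double quotes and is at least two characters long; SUBSTRING raises an error when the computed length is negative, so shorter or empty values would need cleaning up or guarding first.

```sql
-- Sketch only: assumes each pu.email looks like "someone@example.com",
-- i.e. is wrapped in double quotes and has a length of at least 2.
SELECT
    pu.recip_id,
    pu.email,
    vp.utm_source  AS vp_source,
    sch.utm_source AS sch_source,
    gf.utm_source  AS gf_source
FROM [digital].[dbo].[postup_recipients] pu
LEFT JOIN [digital].[dbo].[vp_charges] vp
    -- Strip the first and last character (the quotes) from pu.email,
    -- leaving vp.email untouched so an index on it can be used.
    ON vp.email = SUBSTRING(pu.email, 2, LEN(pu.email) - 2)
LEFT JOIN [digital].[dbo].[stripe_customers] scu
    ON scu.email = SUBSTRING(pu.email, 2, LEN(pu.email) - 2)
LEFT JOIN [digital].[dbo].[stripe_charges] sch
    ON scu.cust_id = sch.cust_id
LEFT JOIN [digital].[dbo].[gform_entries] gf
    ON gf.email = SUBSTRING(pu.email, 2, LEN(pu.email) - 2);
```

If some pu.email values are not quoted, this version can match rows the original would not, so check the data before relying on it.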

Your filters might be able to be played with, but it gets a little tricky. I think vp.utm_source is not null and vp.utm_source != '' => vp.utm_source > '', and then you can create an index on vp.utm_source, which will be used if there are few rows populated. Add vp.email as a secondary column on that index. I think this part is the lesser of the problems, though. The joins above are the biggest issues.
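
To illustrate the filter idea on vp_charges alone, here is a sketch with a made-up index name. It again assumes utm_source and email are index-compatible types. A NULL compared with > '' evaluates to unknown, so NULLs and empty strings are both filtered out by the single predicate.

```sql
-- Hypothetical index name; utm_source is the leading key, email the secondary column.
CREATE NONCLUSTERED INDEX IX_vp_charges_utm_source_email
    ON [digital].[dbo].[vp_charges] (utm_source, email);

-- The single comparison replaces "IS NOT NULL AND != ''" for each column.
SELECT
    vp.email,
    vp.utm_source,
    vp.utm_camp
FROM [digital].[dbo].[vp_charges] vp
WHERE vp.utm_source > ''
  AND vp.utm_camp > '';
```

In the full query these predicates sit inside an OR across three left-joined tables, which is why this part is trickier to optimise than the joins.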

