{"id":11451,"date":"2026-03-25T11:42:27","date_gmt":"2026-03-25T06:12:27","guid":{"rendered":"https:\/\/asia.wordcamp.org\/2026\/?post_type=wcb_session&#038;p=11451"},"modified":"2026-04-11T04:24:21","modified_gmt":"2026-04-10T22:54:21","slug":"parsing-html-without-pain-real-world-use-cases-for-wordpress-html-api","status":"publish","type":"wcb_session","link":"https:\/\/asia.wordcamp.org\/2026\/session\/parsing-html-without-pain-real-world-use-cases-for-wordpress-html-api\/","title":{"rendered":"Parsing HTML Without Pain: Real-World Use Cases for WordPress HTML API"},"content":{"rendered":"\n<p>By the end of this session, attendees will:<\/p>\n\n\n\n<p>-&gt; Understand when and why to use the WordPress HTML API instead of traditional methods like regex, str_replace(), or DOMDocument, including specific security vulnerabilities, HTML5 incompatibility issues, and performance problems each legacy approach creates.<\/p>\n\n\n\n<p>-&gt; Master WP_HTML_Tag_Processor fundamentals for memory-efficient, single-pass HTML parsing: core methods (next_tag, get\/set_attribute, add\/remove_class), the bookmark system for complex document traversal, and when streaming parsing is sufficient for your needs.<\/p>\n\n\n\n<p>-&gt; Utilize WP_HTML_Processor for structure-aware operations: navigate HTML hierarchically using breadcrumbs, track nesting depth, properly match CSS classes, and handle malformed HTML gracefully with built-in error detection.<\/p>\n\n\n\n<p>-&gt; Apply real-world use cases beyond block customization: safely sanitize user-generated content, add performance attributes (lazy loading, fetchpriority, decoding) to any HTML source, modify link attributes programmatically, process shortcode or widget output, and enhance accessibility with ARIA attributes.<\/p>\n\n\n\n<p>-&gt; Navigate the API&#8217;s evolution across WordPress 6.2 through 6.7: understand capability improvements (complete token scanning, text content modification, spec-compliant 
decoding), recognize current limitations (BODY context, bookmark limits), and prepare for future features (CSS selectors, structural modifications).<\/p>\n\n\n\n<p>-&gt; Implement production-ready patterns: integrate the HTML API with WordPress hooks (the_content, render_block, widget_text), write proper error handling for unsupported HTML, choose between Tag Processor&#8217;s speed versus HTML Processor&#8217;s structure awareness, and migrate existing regex-based code safely.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<span class=\"embed-youtube\" style=\"text-align:center; display: block;\"><iframe loading=\"lazy\" class=\"youtube-player\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/eBqdVaHChCo?version=3&#038;rel=1&#038;showsearch=0&#038;showinfo=1&#038;iv_load_policy=1&#038;fs=1&#038;hl=en-US&#038;autohide=2&#038;wmode=transparent\" allowfullscreen=\"true\" style=\"border:0;\" sandbox=\"allow-scripts allow-same-origin allow-popups allow-presentation allow-popups-to-escape-sandbox\"><\/iframe><\/span>\n<\/div><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>By the end of this session, attendees will: -&gt; Understand when and why to use the WordPress HTML API instead of traditional methods like regex, str_replace(), or DOMDocument, including specific security vulnerabilities, HTML5 incompatibility issues, and performance problems each legacy approach creates. 
-&gt; Master WP_HTML_Tag_Processor fundamentals for memory-efficient, single-pass HTML parsing: core methods (next_tag, get\/set_attribute, [&hellip;]<\/p>\n","protected":false},"author":15407217,"featured_media":0,"template":"","meta":{"advanced_seo_description":"","jetpack_seo_html_title":"","jetpack_seo_noindex":false,"jetpack_post_was_ever_published":false,"_wcpt_session_time":1775806200,"_wcpt_session_duration":3300,"_wcpt_session_type":"session","_wcpt_session_slides":"","_wcpt_session_video":"","_wcpt_speaker_id":[11438],"footnotes":""},"session_track":[97],"session_category":[],"class_list":["post-11451","wcb_session","type-wcb_session","status-publish","hentry","wcb_track-track-3-enterprise"],"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/pgnY82-2YH","session_date_time":{"date":"April 10, 2026","time":"01:00 pm"},"session_speakers":[{"id":"11438","slug":"hardik-thakkar","name":"Hardik Thakkar","link":"https:\/\/asia.wordcamp.org\/2026\/speaker\/hardik-thakkar\/"}],"session_cats_rendered":null,"_links":{"self":[{"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/sessions\/11451","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/sessions"}],"about":[{"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/types\/wcb_session"}],"version-history":[{"count":2,"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/sessions\/11451\/revisions"}],"predecessor-version":[{"id":14003,"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/sessions\/11451\/revisions\/14003"}],"speakers":[{"embeddable":true,"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/speakers\/11438"}],"author":[{"embeddable":true,"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wporg\/v1\/users\/thakkarhardik"}],"wp:attachment":[{"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/media?parent=11451"}],"wp:term":[{"taxonomy":"wcb_track","embeddable":true,"href":"https:\
/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/session_track?post=11451"},{"taxonomy":"wcb_session_category","embeddable":true,"href":"https:\/\/asia.wordcamp.org\/2026\/wp-json\/wp\/v2\/session_category?post=11451"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}