How to Extract URLs From Text: Links, Hrefs, and Bare Domains

How to pull all URLs and links out of any text — HTML source, Markdown, log files, or plain prose — and get a clean deduplicated list.

Quick answer: Paste any text into the URL extractor to instantly pull out every link — http://, https://, and bare www. domains — in a clean, deduplicated list you can copy or export as CSV.

Extracting URLs by hand from a long document, an HTML dump, or a log file is tedious, and it is easy to miss links along the way. An extractor scans the entire text in milliseconds and collects every web address it finds.

What URL Patterns Are Matched

The extractor recognises three common forms of web addresses:

Pattern     Example
HTTPS       https://example.com/path?query=1
HTTP        http://legacy-site.com/page
Bare www    www.example.com (no protocol)

Query strings (?q=hello&page=2), paths (/blog/post-title), fragments (#section), and port numbers (:8080) are all preserved as part of the URL.
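
As a rough illustration of how the three forms above can be matched, here is a minimal Python sketch. The regular expression and the names URL_PATTERN and find_urls are assumptions made for this example, not the tool's actual implementation. It keeps everything up to the next whitespace, quote, or angle bracket, so ports, paths, query strings, and fragments stay attached to the match.

```python
import re

# Hypothetical pattern (an assumption, not the tool's real one):
# http://, https://, or a bare "www." prefix, followed by any run of
# characters that are not whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"""(?:https?://|\bwww\.)[^\s<>"']+""", re.IGNORECASE)


def find_urls(text: str) -> list[str]:
    """Return every raw match in order of appearance, with no cleanup."""
    return URL_PATTERN.findall(text)


sample = "API at http://localhost:8080/v1?q=hello&page=2#top and www.example.com here"
print(find_urls(sample))
# ['http://localhost:8080/v1?q=hello&page=2#top', 'www.example.com']
```

The \b before www keeps the bare-domain branch from firing in the middle of a longer word.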

Trailing Punctuation Stripping

In natural prose, URLs often appear at the end of a sentence or inside brackets:

  • Visit https://example.com. → extracts https://example.com
  • See the docs (https://docs.example.com) → extracts https://docs.example.com
  • Link: https://a.com, https://b.com → extracts both without commas

The extractor strips trailing periods, commas, closing parentheses, angle brackets, and semicolons so you get clean, usable links.
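
The stripping step can be sketched in Python as follows. This is only one possible approach under stated assumptions: the pattern, the TRAILING character set (taken from the list above), and the function name extract_clean are all hypothetical.

```python
import re

# Hypothetical pattern: grab everything up to the next whitespace,
# then clean the end of each match afterwards.
URL_PATTERN = re.compile(r"(?:https?://|\bwww\.)\S+", re.IGNORECASE)

# Characters removed from the end of a match: period, comma, semicolon,
# closing parenthesis, closing angle bracket (assumed set).
TRAILING = ".,;)>"


def extract_clean(text: str) -> list[str]:
    """Find URL-like runs and strip trailing sentence punctuation."""
    return [match.rstrip(TRAILING) for match in URL_PATTERN.findall(text)]


print(extract_clean("Visit https://example.com."))
# ['https://example.com']
print(extract_clean("See the docs (https://docs.example.com)"))
# ['https://docs.example.com']
print(extract_clean("Link: https://a.com, https://b.com"))
# ['https://a.com', 'https://b.com']
```

One trade-off of this simple approach: a URL that legitimately ends in a closing parenthesis loses that character too, so a stricter implementation might only strip an unbalanced parenthesis.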

URL vs URI — What’s the Difference

A URI (Uniform Resource Identifier) is a string that identifies a resource. It can be a web address, a mailto: address (mailto:user@example.com), or a name that carries no location at all. A URL (Uniform Resource Locator) is a specific type of URI that also says how to reach the resource: it includes the access method (the protocol) and the address. https://example.com is a URL. urn:isbn:0451450523 is a URI but not a URL, because it names a book without saying where to find it.

This tool extracts URLs: web-accessible links that begin with http://, https://, or a bare www. prefix.
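
To make the distinction concrete, here is a small Python sketch that checks the scheme of a candidate string with the standard library's urllib.parse.urlsplit. The helper name is_web_url and the WEB_SCHEMES set are assumptions for illustration; the tool itself may decide this differently.

```python
from urllib.parse import urlsplit

# Schemes treated as web URLs in this sketch (an assumption).
WEB_SCHEMES = {"http", "https"}


def is_web_url(candidate: str) -> bool:
    """True for http(s) URLs and bare www. domains, False for other URIs."""
    if candidate.lower().startswith("www."):
        return True
    # urlsplit returns the scheme in lowercase, or "" when there is none.
    return urlsplit(candidate).scheme in WEB_SCHEMES


for candidate in [
    "https://example.com",
    "www.example.com",
    "mailto:user@example.com",
    "urn:isbn:0451450523",
]:
    print(candidate, is_web_url(candidate))
# https://example.com True
# www.example.com True
# mailto:user@example.com False
# urn:isbn:0451450523 False
```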

Common Use Cases

Broken-link audits — Extract all links from a page or document, then paste the list into a link checker to find 404s. Much faster than manually crawling the HTML.

Backlink research — SEO tools export large reports as CSV files. Extract all URLs from a pasted export to isolate the domains or paths you need.

Content migration — Before migrating a website to a new domain, extract all internal links from the old content to build a redirect map.

Reference lists — Academic papers and technical documents contain dozens of cited URLs. Extract them all at once to build a bibliography or verify sources.

Log analysis — HTTP access logs contain one URL per line mixed with timestamps, IP addresses, and status codes. Paste the log and extract just the URLs.

Deduplicating link lists — Merge links from multiple sources, paste them all, and get a deduplicated set in one step.
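
For use cases such as backlink research or building a redirect map, the extracted list is often grouped by host afterwards. The following Python sketch shows one way to do that outside the tool. The names host_of and count_hosts are hypothetical, and prefixing http:// to bare www. links is an assumption made only so the standard parser can see a hostname.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL, or "" if none is found."""
    # Bare www. links have no scheme, so add one before parsing (assumption).
    if not url.lower().startswith(("http://", "https://")):
        url = "http://" + url
    return urlsplit(url).hostname or ""


def count_hosts(urls: list[str]) -> Counter:
    """Count how many extracted links point at each host."""
    return Counter(host_of(url) for url in urls)


links = [
    "https://example.com/a",
    "http://EXAMPLE.com:8080/b",
    "www.example.com",
]
print(count_hosts(links))
# Counter({'example.com': 2, 'www.example.com': 1})
```

Note that this sketch treats example.com and www.example.com as different hosts; whether to merge them depends on the audit.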

Sorting and Deduplication

Turn on Remove duplicates to ensure each URL appears only once (first occurrence kept). Turn on Sort A → Z to alphabetise the list — useful when comparing two sets of links or building an ordered reference list.
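
The two options can be sketched in a few lines of Python. This is an illustrative sketch, not the tool's code: the function names are hypothetical, duplicates are compared as exact strings, and the sort ignores letter case, both of which are assumptions.

```python
def remove_duplicates(urls: list[str]) -> list[str]:
    """Keep the first occurrence of each URL and preserve input order."""
    # dict keys are unique and remember insertion order.
    return list(dict.fromkeys(urls))


def sort_a_to_z(urls: list[str]) -> list[str]:
    """Return the URLs in alphabetical order, ignoring letter case."""
    return sorted(urls, key=str.lower)


links = ["https://b.com", "https://a.com", "https://b.com", "www.c.com"]

unique = remove_duplicates(links)
print(unique)
# ['https://b.com', 'https://a.com', 'www.c.com']
print(sort_a_to_z(unique))
# ['https://a.com', 'https://b.com', 'www.c.com']
```

Because the comparison is an exact string match, https://example.com and https://example.com/ would count as two different entries in this sketch.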

CSV Export

The Copy as CSV button produces a single-column CSV:

url
https://example.com
https://docs.example.com/guide
www.another-site.com

This imports directly into Google Sheets, Excel, Airtable, or any tool that accepts CSV input.
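
If you need to produce the same single-column layout from a script, Python's csv module can write it. The function name to_csv is hypothetical, and this is a sketch of the output format shown above rather than the tool's export code.

```python
import csv
import io


def to_csv(urls: list[str]) -> str:
    """Build a single-column CSV with a "url" header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["url"])
    for url in urls:
        # The writer quotes a value automatically if it contains a comma.
        writer.writerow([url])
    return buffer.getvalue()


print(to_csv([
    "https://example.com",
    "https://docs.example.com/guide",
    "www.another-site.com",
]))
# url
# https://example.com
# https://docs.example.com/guide
# www.another-site.com
```

Using a CSV writer instead of joining lines by hand matters for URLs whose query strings contain commas, since those values need quoting to stay in one column.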

Privacy

The tool runs entirely in your browser. Your pasted text is never sent to any server. This makes it safe for internal documents, private links, and confidential content.