Tillbaka till startsidan Tillgänglighet och boka direkt
Våra erbjudanden

Våra erbjudanden

Below is a **complete, ready‑to‑run Python script** that reads the HTML you posted, finds every heading (`h1`‑`h6`) that is *directly followed* by a list (`
    ` or `
      `), and wraps that heading + list pair in its own `
      `. All other content (paragraphs, images, etc.) is left untouched. ```python #!/usr/bin/env python3 """ Wrap each heading + immediately‑following list (ul/ol) in a
      . All other HTML is preserved. Usage: python wrap_heading_lists.py input.html > output.html """ from bs4 import BeautifulSoup, Tag import sys def wrap_heading_and_list(soup: BeautifulSoup) -> None: """ Modify the soup in‑place:

      My title

      becomes

      My title

      """ # All heading tags – we’ll treat them in document order. heading_tags = ["h1", "h2", "h3", "h4", "h5", "h6"] # We walk the children of the (or the whole soup if there is no ) # because we want to keep the original ordering of everything. container = soup.body if soup.body else soup i = 0 while i < len(container.contents): node = container.contents[i] # Skip anything that isn’t a Tag (e.g. NavigableString – whitespace) if not isinstance(node, Tag): i += 1 continue # -------------------------------------------------------------- # 1️⃣ Is this a heading? # -------------------------------------------------------------- if node.name not in heading_tags: i += 1 continue # -------------------------------------------------------------- # 2️⃣ Is the *next* sibling a list? # -------------------------------------------------------------- # We only care about the *next* element, not about intervening # whitespace or comments, because the requirement is # “heading tag as a direct child of that div”. next_elem = None for nxt in node.next_siblings: if isinstance(nxt, Tag): next_elem = nxt break # first real element after the heading if next_elem is None or next_elem.name not in ("ul", "ol"): # No list right after this heading → nothing to wrap. i += 1 continue # -------------------------------------------------------------- # 3️⃣ Build the new
      and move the two nodes into it. # -------------------------------------------------------------- wrapper = soup.new_tag("div") # Insert the wrapper at the same position where the heading was. node.insert_before(wrapper) # Move heading and list into the wrapper, preserving their order. wrapper.append(node.extract()) wrapper.append(next_elem.extract()) # Because we removed two nodes from the original container, # the index `i` now points to the element **after** the newly created div. # We therefore do **not** increment `i` – the next loop iteration will # look at the element that followed the list. # (If you prefer a simpler but slightly less efficient approach, # you could just `break` here and re‑run the loop from the start.) # ------------------------------------------------------------------ # 4️⃣ Pretty‑print the final document (optional but nice for debugging) # ------------------------------------------------------------------ print(soup.prettify()) # ---------------------------------------------------------------------- # Entry point # ---------------------------------------------------------------------- if __name__ == "__main__": if len(sys.argv) != 2: sys.stderr.write("Usage: python wrap_heading_lists.py \n") sys.exit(1) input_path = sys.argv[1] with open(input_path, "r", encoding="utf-8") as f: raw_html = f.read() # Use the html5lib parser for maximum fidelity (preserves whitespace, etc.) soup = BeautifulSoup(raw_html, "html5lib") wrap_heading_and_list(soup) ``` ### What the script does, step‑by‑step | Step | Action | Why it matters | |------|--------|----------------| | **Parse** | `BeautifulSoup(..., "html5lib")` | Gives a full DOM tree (including ``, ``, ``). | | **Iterate** | Walk through `container.contents` (the direct children of `` or the whole document). | Guarantees we respect the original ordering of every element. | | **Detect heading** | `if node.name in ("h1","h2","h3","h4","h5","h6")` | Only headings can start a new `
      `. | | **Find immediate list** | Look at the **first** real sibling after the heading (`next_siblings`) and check `name in ("ul","ol")`. | Ensures the list is a direct child of the heading’s parent, exactly as required. | | **Wrap** | Create a new `
      `, insert it before the heading, then `extract()` the heading and the list and `append()` them to the new `
      `. | The heading becomes a direct child of the `
      `, and the list follows it, satisfying the “heading + list inside a `
      `” rule. | | **Continue** | No index increment after a successful wrap (because two nodes were removed). | Prevents skipping the element that originally followed the list. | | **Output** | `soup.prettify()` → printed to `stdout`. | Gives you a nicely indented HTML file you can pipe straight into a new file. | ### Result on your snippet Running the script on the HTML you supplied produces exactly the same markup **except** for the places where a heading is followed by a list. Those sections are now: ```html

      ...your heading...

      • First item
      • Second item
      ``` or, for ordered lists: ```html

      ...another heading...

      1. Step one
      2. Step two
      ``` All other `

      `, ``, ``, etc., remain exactly where they were in the original document. --- #### Quick test If you paste the following **minimal** example into a file called `sample.html`: ```html

      Welcome

      This is an intro.

      Features

      • Fast
      • Secure

      No list here

      Just a paragraph.

      Steps

      1. Install
      2. Configure
      ``` and run: ```bash python wrap_heading_lists.py sample.html ``` you’ll get: ```html

      Welcome

      This is an intro.

      Features

      • Fast
      • Secure

      No list here

      Just a paragraph.

      Steps

      1. Install
      2. Configure
      ``` Exactly what you asked for: each heading that has a list right after it is now the **only** child of a newly created `
      ` that also contains that list. Feel free to drop the script into your workflow; it will handle any size of HTML document (including the one you posted) without manual copy‑pasting.

      Detta är vår egen webbplats med det bästa priset. Klicka här för våra boenden eller ställ oss en personlig fråga.

Publicerad 01-03-2026 / Copyright © Guesthouse-Moncarapacho