Tuesday, December 13, 2011

Self-Publishing an Ebook, Part 3: Making a Basic .epub in Sigil

In this installment of my Self-Publishing an Ebook series, I will go over the basics of using Sigil: getting a book from the word processor into Sigil, basic formatting, and some troubleshooting.

Previous Topics:

This section will assume that you have installed Sigil on your computer successfully.  If you haven't done this yet, you may want to refer to the previous sections above to install this free program, and to become familiar with the interface.  This section also assumes that you have a finished book, in a word processing program.

Note: If you haven't written your novel, do that first.  Write it in a dedicated word processor (a writing program, like Microsoft Word, Google Docs, etc.); these programs will at least help you catch spelling and grammatical errors.  Sigil won't catch them, because it's not really a word processor--Sigil is designed to format and edit book files, and I do not recommend it for composing.  If you were planning to save time by writing your book in Sigil to begin with...well, save yourself a boatload more time in editing by writing in a program that will catch the many nagging language errors you may not see until an adoring fan points them out.

Step 1: Get the book into Sigil

Alright; so, you have Sigil installed and a book that needs to become an .epub.  Fantastic!  The first thing you need to do is save a copy of your book--email it to yourself.  With the book now safely tucked away in the "just in case", open your book, and open Sigil.  Here is what Sigil should look like:


As the writer, right now, all you need to worry about is the big white screen called "Section0001.xhtml".  The file extension (.xhtml) stands for Extensible HyperText Markup Language--the computer language your book will be coded in using Sigil.  However, remember that Sigil is a WYSIWYG editor, so you don't really need to know any XHTML to use it (well, okay, a very little--but nothing you won't learn here or with a simple Google search).


Before starting, be sure that Sigil is in "Book View" by selecting the open book icon. 


Now, copy and paste your book into Sigil.  Select all of the text of your book, copy it, and paste it into Section0001.xhtml.  

The first thing you may notice is that your book text and formatting are now messed up.  There may be uneven lines and margins, spaces between paragraphs, your italics/bold/underline may be gone, and all of your images will have been removed.  Unfortunately, Sigil isn't perfect; however, you will eventually need to proof this document against your original anyway, so that's when you will add and fix things.  Let all the missing junk slide for now, and focus on what is there.

Step 2: Add Chapters

If your book has chapters, or other such segment definitions, you will need to add them manually.  You can do this by finding where your chapters start, placing your cursor in front of your chapter header, and clicking the Chapter Break button.  If you started each of your chapters with something like "Chapter 1/2/3/etc.", then you can use the search function to make this a lot easier.

Use the "Find" function to find the text you used to indicate a new chapter or segment.


Use the "Chapter Break" button to make different chapters in your .epub.


Every time you use the "Chapter Break" button, a new .xhtml file will appear in the sidebar, and each one will have a default generated name--your second chapter will be "Section0002", the third will be "Section0003", etc.  While adding your chapter breaks, don't forget to make breaks for your book cover, copyright page, dedication page, author bio page, and anything else that will be another "section" of your book.

Once you have your chapters sectioned out, you can rename them to something more useful than the "Section000X" titles.  To rename a section, right click it in the sidebar, and select the "Rename" function.  When you rename, be aware that you need to keep the .xhtml file extension.  If you rename your file to something that Sigil doesn't allow, it will give you an error message, and you will need to try again. 

Pick names that let you identify each section at a glance.  For example, "Cover.xhtml", "Chapter_1.xhtml", and "Index.xhtml".

Step 3: Fix the Paragraphs

Now, we are going to look at the underlying code of the file.  Some individuals reading this will already know XHTML and how it works; good for you!  

For those who aren't familiar with XHTML...
If you remember from school the human models with the chest flap missing, where you could see all the organs and bones and guts, this is going to be kind of like that.  Part of the audience will find this kind of neat, because it's the mechanics of what makes things work; the other part of the audience is going to get a little freaked out and overwhelmed as we look at a bunch of stuff we don't normally look at.  Both stances are fine; hopefully, when the code is demystified, it won't be nearly as overwhelming.


 Click the "Split View" button to view the finished text next to the code, or the "Code View" button to view the code alone.




In code view, this is what you will see.  Of course, this is the start of one of my books, Arrival of the Traveler; it has its own unique set of problems to fix.  What you see in the code window will be a little different, because your book is different, and it has a different formatting history.


All of the stuff at the top, the stuff that starts with "?xml version" and "DOCTYPE"--DON'T TOUCH THAT.  These little bits of code at the start can range from optional to required; they include the XML declaration and the document type declaration, which state which XHTML standard the file follows.  These are bits of code that you will likely never edit.  (If you want to play with them as a learning exercise, I encourage you to do so; don't perform such experiments on the only copy of your book, though.)

On the other hand, there are things you will need to edit.  Do you see the paragraph tags?  They come in pairs: the starting half is <p>, and the ending half is the same, but with a forward slash in front: </p>.  You will find that most tags have a starting and ending tag in Sigil, and they usually sandwich something that comes in between them.  For example, the <head> tags wrap around the <title> tags; because having something in the middle isn't required, the <title> tags are there alone (but note that they still appear in a starting and ending pair--if this document had a title, it would be between those tags).  You might wonder now about the other half of the <body> tag; it does exist, but it is near the end of the chapter, because most of the text in the code will be in the <body> section of the code.
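
To make the pairing concrete, here is a stripped-down sketch of what a chapter file might look like in Code View.  The exact header lines Sigil generates depend on your version, and the two sample paragraphs are placeholders of mine, so treat this as an illustration and not as a copy of your file:

```html
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<!-- Everything above this comment is the "don't touch" part. -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <!-- <head> wraps around <title>; the title pair is empty here -->
  <title></title>
</head>
<body>
  <!-- each paragraph sits between an opening <p> and a closing </p> -->
  <p>First paragraph of the chapter (placeholder text).</p>
  <p>Second paragraph of the chapter (placeholder text).</p>
</body>
</html>
```

Notice that the closing </body> and </html> tags sit at the very bottom, which is why you won't see them when you first open the code.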

Now, back to those paragraph tags, <p> and </p>.  They are a problem, because they are putting some unsightly spaces between my paragraphs; adding space before and after the block of text is an inherent property of the <p> </p> tags.  And, the paragraph tags aren't indenting the start of each paragraph.  This can be seen in the split view:


What we want to do with the paragraph <p> </p> tags is to replace them with <div> </div> tags.  A <div> tag is often used in conjunction with CSS (we will talk about CSS later); "div" is short for "division", as in a division or section, and not to be confused with mathematical division.  When using <div> </div> tags, you are dividing up your text into sections, in this case paragraphs.  I prefer <div> </div> tags to <p> </p> tags, and they are MUCH better than hand formatting your document, because you will be able to go into your CSS document and change your paragraph setting once to change the style of the whole document (I will show you how).

To change all of the <p> </p> tags to <div> </div> tags, we will use the "Replace" button.  When you click the "Replace" button, you will need to fill in two fields.  They are conveniently labeled "find what" and "replace with".  You will want to find <p>, and replace it with <div>.  Then, find </p>, and replace with </div>.  Unless you have a fondness for continually clicking the "Find Next" and "Replace" buttons, you may want to shorten the time it takes to do this by using the "Replace All" button.  
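
As a rough before-and-after of what the two replacements do to the code (the sentences are placeholder text, not from a real book):

```html
<!-- Before "Replace All" -->
<p>The train pulled into the station.</p>
<p>Nobody got off.</p>

<!-- After replacing <p> with <div>, then </p> with </div> -->
<div>The train pulled into the station.</div>
<div>Nobody got off.</div>
```

One thing to watch for: the search only matches the exact text you type.  If some of your opening tags came in with extra bits inside them, such as <p class="something">, a search for <p> will skip those, while their closing </p> tags still get replaced.  A quick scroll through the code after replacing will tell you whether any stragglers are left.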

Note: You will need to do this find and replace procedure in each chapter .xhtml file of your book.  I do recommend doing this by chapter, instead of at the start before dividing into chapters, because handing Sigil a job that is too big can cause the program to freeze up.  Additionally, if you accidentally mess up the formatting of a single chapter, it's easier to copy that one section from the original and fix it than to start over on the whole document.

Step 4: Change the CSS

This step should actually be step 3.5, because its function is to assist in fixing the paragraphs.  However, it has received its own section here because not everyone knows or is comfortable with CSS.  CSS stands for "Cascading Style Sheets", and it controls the style of your document.  To make a .css file for Sigil, you will need to right click the "Styles" folder in the sidebar, and click "Add new item".  Then, open the new file you have just created in the Styles folder, which will have the default name "Style0001.css".  In Style0001.css, type in the following code:

div {text-indent:50px; }

This little bit of code means that everywhere there is a <div> tag in your .xhtml files, the text will be indented by 50 pixels.  When this change is implemented, if 50 pixels looks like too much, you can come back to the .css file and change it to be less--and it will change throughout the entire book.  This is a huge time saver compared to changing the indent for each paragraph by hand.

Now take the following code:

<link href="../Styles/Style0001.css" rel="stylesheet" type="text/css" />

Go back to the code view of each of your .xhtml chapter files, and paste the code right after the <title> </title> tags.  It should look like this:

<title> (Your title goes here) </title>
<link href="../Styles/Style0001.css" rel="stylesheet" type="text/css" />

Save this change, and then go back and look at the Book View.  Your indents should be much more uniform now, and as previously stated, you can change them if you need to by adjusting the .css file.

CSS can be used to uniformly control a lot of the format of your book.  For example, you can control the justification of your text by using this code: body { text-align: justify; }.  You can control your margins with this code: @page { margin-top: 30px; margin-bottom: 20px; margin-left: 30px; margin-right: 30px; }.  If you think CSS is pretty cool, you can learn more about it here.


Step 5: Fix the Spaces

This step targets two problems.  The first problem has to do with spaces between sentences, and the second problem has to do with spaces anywhere else where they are not desired.  

Not everyone will have the first problem, but if you put two spaces between the end of one sentence and the start of the next, this applies to you.  The practice of using two spaces between sentences came from an old issue with typesetting.  However, this practice is no longer needed; unfortunately, some individuals, such as myself, still compulsively hit that space bar more than once before starting the next sentence.  Where these extra spaces occur, they leave awkward gaps when displayed on an ereader:


See how the text doesn't quite justify correctly where the extra spaces are?  If the spaces weren't there, the text would be uniformly flush to the left.  It's subtle, but it's there, and it gets more exaggerated the larger the text is; it will drive certain perfectionist editor friends loony (credit to my frieditor Monique, who took the screen shot above).  That's the reason the extra spaces must go.  

To get rid of them, go back to the code view screen, and use the "Replace" button.  In the "find what" field, write "&nbsp;" (without the quotes, but with the semicolon and ampersand; if you're curious, "nbsp" stands for "non-breaking space").  Because we want to just remove these code bits, and not replace them with anything, leave the "replace with" field empty.  
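
Here is a small sketch of what that looks like in the code.  It assumes your double spaces came through the paste as a non-breaking space followed by a regular space, which is the usual way an extra space is stored because XHTML squeezes a run of ordinary spaces down to one; the sentences are placeholder text:

```html
<!-- Before: a non-breaking space plus a regular space after the period -->
<div>She closed the door.&nbsp; The hallway was empty.</div>

<!-- After replacing &nbsp; with nothing: one regular space remains -->
<div>She closed the door. The hallway was empty.</div>
```

If your file happens to have an &nbsp; sitting between two words with no regular space next to it, removing it will run those words together, so spot check a few sentences after replacing.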

On to the next problem--awkward spaces that occur anywhere else in your text.  Going back to the split screen image, we can see that not all of the spacing between my paragraphs was caused by using paragraph tags <p> </p>:


Near the bottom of the coding, there is a little bit of code:

<p><br /></p>

Even if the <p> </p> tags have been changed to <div> </div> tags, there will still be a break there because of another spacing tag: <br />.  The break tag <br /> inserts a hard break into the text.  Because its function is purely to add a line break, and not to modify any text, it is not a paired tag.  These tags, and some others, can be a little tricky to weed out.  As a writer, I sometimes actually want a hard break in my chapter to distinguish a change in scene; however, when extra ones come along for whatever reason when a new text is pasted in, that creates a problem.  How do I separate the <br /> tags that I want from the ones I don't?

I don't have a canned solution for this one.  Because I don't want 98% of the <br /> tags in the code, I usually use the "Replace" button to remove them all, and then add back in the ones I wanted when I go back to proof my .epub against my original.  
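
To show why there is no easy shortcut, here is a sketch of the wanted and unwanted breaks side by side, using <div> tags as they would appear after Step 3 (placeholder text again).  In the code they look exactly alike, which is why I clear them all out and put the deliberate ones back by hand:

```html
<!-- What the paste often leaves behind: an otherwise empty block holding a break -->
<div><br /></div>

<!-- After replacing <br /> with nothing, an empty pair remains and shows no gap -->
<div></div>

<!-- A deliberate scene break, added back by hand while proofing -->
<div>He never looked back.</div>
<div><br /></div>
<div>Three weeks later, the letter arrived.</div>
```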


Step 6: Compare to Original

Now is the time when you will worry about anything that Sigil removed when you copied and pasted it in.  Sigil may remove or change things like font size, bold, italics, underlines, etc.  You will need to do a text-to-text comparison with the original, and add these things back in using the interface in the book view.  It is tedious and time consuming, but at the same time, it is also good practice to look over your .epub page by page to be sure nothing has been lost in translation to the .epub.

As I said near the start, your text and code are likely to have issues I haven't explained here.  However, with a basic knowledge of the way the XHTML tags work and a split screen view, you will likely be able to make a good guess that the codes appearing right where your text format gets weird are the culprits.  Go to Google and type in "what does the (insert code here) do"--that will help you determine if those codes should be removed to get your .epub looking the way you want.


Of course, if you're seeing something really weird and hard to fix, I would love to see it.  Even if it's purely for the joy of saying, "Huh.  How did THAT happen?"  I may be able to help, so don't hesitate to shoot me a comment or an email. 

Until the next installment, Happy .Epub-ing!

Al 
