Goal
Using VBA copy all text from MS Word file to a single cell in MS Excel, while preserving the text formatting.
Caveats:
- Text may contain return carriage / linefeed characters (see Section below on Return Carriage /Linefeed)
- Character formatting must be preserved as derived from the MS Word Compare function including: red strikethrough for deleted text, green underline for added text.
- Documents do not contain: Section breaks, headers/footers, tables (inline & floating), footnotes, endnotes, indents, tabs, etc.
- Page layouts and paragraph formatting does not need to be preserved.
- I know there are other solutions outside of VBA for text comparison, but the end users are desperate to have this in Excel. Save those for another question. Answers outside of VBA will be down voted.
Carriage Return / Linefeed: MS Products don't play nice
In short, Word uses CHAR(13), or ^p if searching via Find within Word. Whereas Excel uses CHAR(10) for CrLf within a string. If text is copied from Word and pasted to Excel, and Excel sees CHAR(13) in the pasted text it is interpreted as "at this point move to the next cell and then continue pasting text." When run in a loop this starts to cause issues.
First Iteration of the Loop
The end result is that only the first paragraph, i.e. anything before the first CHAR(13), of the first text comments ends up in the intended cell. Each subsequent paragraph ends up in the same column but down one row.
| rec_num | Comments |
|---|---|
| 1 | R1.Para1 |
| R1.Para2 | |
| ... | |
| R1.ParaN |
After all iterations of the Loop
On the next iteration of the loop, the first paragraph is written in the intended cell (same column but one row down from cell for the previous iteration) but overwrites the second paragraph from the previous iteration of the loop. This goes on for all comment comparisons in the loop. Hence, only the first paragraph is ever captured to the intended cell.
| rec_num | Comments |
|---|---|
| 1 | Rec1.Para1 |
| 2 | Rec2.Para1 |
| ... | ... |
| N-1 | RecN-1.Para1 |
| N | RecN.Para1 |
| RecN-1.Para3 | |
| ... | |
| RecN-1.ParaK |
Half Fix
You can use the Replace() function in VBA, or = SUBSTITUTE() in an excel formula to resolve this, but then text formatting is lost (see below for VBA Code examples)
| rec_num | Comments |
|---|---|
| 1 | Rec1.Para1 Rec1.Para2 ... Rec1.ParaK |
| ... | ... |
| N | RecN.Para1 RecN.Para2 ... RecN.ParaK |
For further details see Preserve line breaks when inserting/pasting into a single Excel cell
Attempted Solutions
The following attempted solution preserves text formatting, but generates the CrLf issue:
' Lots of previous code. most important part is below ' Run the Compare function in MS Word between doc1 and doc2 to get marked-up text in Word objDoc.Activate objDoc.Compare Name:=objDoc2.Name, CompareTarget:=wdCompareTargetCurrent objDoc.Save objDoc2.Close ' Copy compare text from doc1 objWord.Selection.WholeStory objWord.Selection.Copy ' Pate formatted text from doc1 to a cell in Excel ActiveSheet.Paste Destination:=ws.Range("F" & row_num) objDoc.Close This attempted solution solves the CrLf issue, but does not preserve text formatting:
' Lots of previous code. most important part is below ' Run the Compare function in MS Word between doc1 and doc2 to get marked-up text in Word objDoc.Activate objDoc.Compare Name:=objDoc2.Name, CompareTarget:=wdCompareTargetCurrent objDoc.Save objDoc2.Close ' Copy compare text from doc1 and paste the text into a singe cell in Excel If objWord.ActiveDocument.Paragraphs.Count >= 1 Then objWord.Selection.WholeStory ' Yes I did try objWord.Selection.FormattedText without success :( str1 = Replace(objWord.Selection.Text, Chr(13), Chr(10)) ws.Range(col & row_num).Value = str1 End If Partial Answers from StackOverflow
- Excel VBA copy and paste selected text from word
- Cannot get WholeStory to select & format all text in a Word Doc using Excel VBA
- Word VBA to solve linebreak or paragraph mark issue when copy & Paste into excel (this might work, but seems clunky)
没有评论:
发表评论