
kewakl (Customer) asked a question.
If it is not too difficult, please remove formatting (color/font/typeface) tags from imported legacy content. Maybe we could run the imported legacy content through some REGEX filtering that could remove all of the old forum's formatting tags:
[COLOR=".."], [FONT=".."], [LEFT], [B], [I], [U]... and matching closing tags
Trying to read the 'content' with all of the unsupported formatting codes is a struggle for me.
We NEED A COMPLETE ARCHIVE of the original forum. There is so much valuable (even if old) information to be lost there!
The clock is ticking. August 13 -- the legacy forum goes dark and with it goes a wealth of information that could not be imported, may not migrate to the KB, or will remain unused due to formatting of the imported text.
I am not knocking AD for the work they did to import the bulk of the old forum, just hoping that a bit more can be done (over time) to make the results of their efforts more user-friendly.
Original post
Imported content
Somewhat cleaned up revision
Thank you for making it this far!
I have proofed some regular expressions in Notepad++ V7.5.9.
[(?i)[size]="[0-9]*]
[(?i)[//size]]
[(?i)color="[a-z ]*"]
[(?i)[//color]]
[(?i)[font]="[a-z ]*"]
[(?i)[//font]]
Breakdown - until we have access to a fixed-width font, I have to have a way to show whitespace. I hope that this formatting works.
[(?i)[font]="[a-z ]*"] <- this matches any BB Code in the format [FONT="fontname"] without respect to case. Font, font, FoNt all match.
[ <opening regular expression
(?i) <ignore case in regular expression searching
[font]=" <original forum BB Code for setting a font
[a-z ]* <regular expression to match zero or more letters or spaces
" <closing quote from BB Code
] <closing regular expression
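For anyone who would rather script this than use Notepad++, here is a rough Python sketch of the same idea. It is not the exact set of patterns above, and it is not what AD actually runs. It assumes the legacy tags look like [TAG], [TAG="value"] or [TAG=value], with closing tags like [/TAG]. The tag list and the names TAGS, BBCODE_TAG and strip_bbcode are made up for illustration.

```python
import re

# Hypothetical tag list, taken from the tags named in this thread.
TAGS = ("color", "font", "size", "left", "b", "i", "u")

# Matches an opening tag such as [FONT="Arial"], [SIZE=3] or [B],
# and a closing tag such as [/FONT], without regard to case.
# Assumption: the tag value never contains a "]" character.
BBCODE_TAG = re.compile(
    r"\[/?(?:" + "|".join(TAGS) + r")(?:=[^\]]*)?\]",
    re.IGNORECASE,
)


def strip_bbcode(text: str) -> str:
    """Remove the listed formatting tags and keep the text between them."""
    return BBCODE_TAG.sub("", text)


sample = '[FONT="Arial"][COLOR="Red"][B]Hello[/B][/COLOR][/FONT] world'
print(strip_bbcode(sample))  # prints: Hello world
```

Tags that are not in the list (for example [IMG] or [URL]) are left alone, so images and links would still need their own handling.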
I agree, having all of that font declaration stuff show up is just clutter. I would not want to have to sift through that stuff to find the really good content that the community has built up over the years.
I hope they can make some progress on this.
The missing image content and broken links are also distressing, as there are certain things best expressed visually, and continuity to relevant threads is also useful.
It's been a month and neither of those issues has been fixed yet. I'm worried they ran a one-time script, just extracted the text, and called it a day -- with literally years of content effectively deleted or significantly degraded (images missing, un-findable, and poorly formatted). This transition is greatly disappointing.
I sent a message to the AD web form today, hoping they'll hear the request, as different people may read those messages than the ones who maintain the forums. Fingers crossed.
We have been working to clean the unsupported characters out of the legacy posts and made our first load last night. This should clean up a majority of the characters that cluttered the text and fix a lot of the links that were broken by bad syntax. Of course, links that are no longer valid (which includes links to the old forum) will not work. We are trying to rebuild some threads from the old forum that have been mentioned specifically.
We have over 100,000 posts to work through, so it takes a bit of time to find and fix everything.
It's good to know that it's being worked on!