Unpublish asset when renamed or moved to another folder upon ...

42 Thumb-up Thumb-down
Idea by tim.reilly about 1 year ago - Open

This idea is a continuation of the feature request located here: http://support.hannonhill.com/browse/CSNEW-77

Sometimes the asset’s end date will trigger a move to another folder
for “expired” content. The asset should probably be unpublished before
it is moved.

The same can be said for assets that are renamed. Currently, when
an asset is renamed, the old asset remains on the server and does not
get unpublished.


17 comments

  • 2 points by len.lanphar about 1 year ago
    One thing I'm really curious about: does anyone have any use cases where a published asset should NOT be unpublished when it is renamed/moved/expired? If so, maybe what's needed is a reframing of the problem. I think more generally speaking, there needs to be some way to keep published files from being orphaned. Here's a quick list of ideas off the top of my head for different ways this might be accomplished:
    1. Automatically unpublish when renamed/moved (i.e. the original request).
    2. Leave some sort of "ghost" asset at the original path with an appropriate icon such that the user knows that the asset is gone from that location in the CMS but that there might still be work to do before making it go away completely. This ghost asset could have action options along the lines of "check for links to me" and "unpublish and delete me".
    3. When publishing a folder to a destination, have the ability to automatically "sync" that folder and remove anything the CMS finds that shouldn't be there.
    4. Have some sort of "find orphaned objects" tool for destinations that will do more of an interactive sync.
    • 0 points by mike.strauch 8 months ago
      I think the only problem with ideas 3 and 4 is that we can't assume that all of the files located on the destination server are Cascade managed.  It's probably unlikely that files in a particular directory on a destination are not Cascade managed, but there's no way to tell.
  • 0 points by wjoell about 1 year ago
    As I understand it, the issue lies with the published assets on the destination server and not so much with managing links internally. If a file or group of files or entire directories are renamed and/or moved, the links internally should get picked up without a problem. What I would like to see is the same dialog that comes up prompting to unpublish when an asset is deleted to come up for any move or rename of a published asset. As for expiring assets, there should be an option to specify unpublish when setting up the expiration.
  • 0 points by srutland 11 months ago
    This may be an oversimplification of the case for deleting files on the destination site that have been renamed in Cascade, but what is the possibility of having a file synchronization method in Cascade?  That is, a method for comparing assets (publishable files) in Cascade with the assets that are published?  For example, in Dreamweaver, when publishing from a local site (loosely equated to a site within Cascade), you're given the option of deleting files on the destination site which don't exist on the local site.
    A synchronization between the local and remote (e.g. between Cascade and the published site) would reveal these artifacts and let the site manager know what needs to be cleaned up.
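A rough sketch of what such a comparison might look like, in Python. This is not how Cascade works; the function name and the idea of a "manifest" of paths the CMS has previously published are assumptions made for illustration. The manifest is there so that files on the server that the CMS never managed are not flagged for cleanup.

```python
def find_orphans(cms_paths, destination_paths, managed_manifest):
    """Return destination files that the CMS once published but no longer has.

    cms_paths:         paths of publishable assets currently in the CMS
    destination_paths: paths found on the published site
    managed_manifest:  hypothetical record of every path the CMS has published
    """
    cms = set(cms_paths)
    managed = set(managed_manifest)
    # Only files the CMS is known to have published are candidates;
    # anything else on the server is left alone.
    return sorted(p for p in destination_paths if p in managed and p not in cms)


cms_paths = ["/news/index.html", "/news/contact-us.html"]
destination_paths = ["/news/index.html", "/news/contact.html", "/news/legacy.cgi"]
managed_manifest = ["/news/index.html", "/news/contact.html", "/news/contact-us.html"]

orphans = find_orphans(cms_paths, destination_paths, managed_manifest)
print(orphans)  # ['/news/contact.html']
```

Here `/news/legacy.cgi` is ignored because it never appears in the manifest, while the renamed `contact.html` shows up as something the site manager may want to remove.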
    • 0 points by roryreiff 8 months ago
      Yes, this would be a nice solution to this problem.
  • 0 points by bryanzera 5 months ago
    What is the logic for not removing files from the destination when a file is renamed or deleted?
  • 0 points by johnsons4.scranton about 1 month ago
    This would be great for us because of time- and date-sensitive material and its availability on search engines.  We have press releases about events or open house dates from '08 and '09 that we don't want people to find in our internal search results or in Google, Bing, etc. search results and get confused.

    Scenario:
    If we set these articles to expire, even if they move out of the publishable folder, the articles still exist on the server, thus they will still get picked up by search bots.

    When enabling expiration on an asset, it would be very useful to have an "unpublish" check box that would allow us to decide if the asset should be unpublished.
  • 0 points by bradley.wagner about 1 month ago
    Had a conversation with len.lanphar recently in which I suggested a few possible solutions. All of these begin with actually un-publishing the asset (or giving users the option to) when it is moved or renamed. That part is not particularly difficult but is only one aspect of the problem. The other issue is how to deal with:
    • all of the links within CMS managed pages that will work in Cascade but be broken on live site until re-published
    • any inbound links from non-CMS managed sites, references from other sites, search indexed content

    I had suggested the following solutions (in order of increasing difficulty and sophistication):
    1. Do nothing. Basically, leave it up to the site managers to re-publish their site regularly to fix any out-of-date links. While easy, it's not particularly intuitive. Also it doesn't elegantly address the problem of those moved pages having been picked up by search engines or other sites in the past.
    2. Either prompt the user to re-publish the site containing the asset, or re-publish it automatically. This is based on the assumption that there won't be that many links to the content outside of the site it is on. Of course, this doesn't hold up if you change the name of a prominent top-level page (e.g. 'contact' -> 'contact-us'). It also doesn't address the search engine issue.
    3. Replace the asset on the web server with some kind of "redirect asset" that gets published as PHP or some other scripting language and issues a 301 Redirect. This would handle the links coming from out-of-date content until they're republished. This would also address the search engine/inbound links from outside issue. NOTE: a meta-refresh in HTML is not the same as a 301 redirect done with .htaccess or a scripting language from an SEO standpoint.
    4. Actually determine all the assets that link to the moved/renamed asset in Cascade and automatically re-publish all of them. You still would run into problems with search engine indexed content and outside links to the content's old location.
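To make option 3 a little more concrete, here is a minimal Python sketch that turns a list of moves into `.htaccess` lines using Apache's `Redirect 301` directive. The function name and the `moves` list are made up for the example, and it assumes an Apache destination with simple paths (no spaces or special characters). It is only one way a 301 could be issued, not a description of anything Cascade does.

```python
def htaccess_redirects(moves):
    """Build Apache 'Redirect 301' lines from (old_path, new_path) pairs."""
    lines = []
    for old_path, new_path in moves:
        if old_path == new_path:
            continue  # nothing actually moved, so no redirect is needed
        lines.append(f"Redirect 301 {old_path} {new_path}")
    return "\n".join(lines)


# Hypothetical record of renames made in the CMS.
moves = [
    ("/contact.html", "/contact-us.html"),
    ("/about.html", "/about.html"),
]
print(htaccess_redirects(moves))  # Redirect 301 /contact.html /contact-us.html
```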
    I'm interested to get your feedback on how much of an issue the broken links and search indexed content are for your organizations.

    I'm also curious how much of this the CMS should automate.

    #1, for example, would be fairly easy for us to implement but may not provide the level of automation that you are expecting.

    I'll let len post his thoughts here as well.

    • 0 points by ericepps about 1 month ago
      My vote would be option #3. Except that a 301 should only be used if content is being moved to an archive folder (this would be great for our news articles).

      If content is being removed completely, a 404 would be the appropriate response. Chances are, if content is marked to be unpublished after a time, the site owner is not concerned with or doesn't want links maintained to that page (this would be great for our job postings). Site owners should map the "End Date" field to an "Expires" header/meta tag for SEO (perhaps this can be made easier?).
      • 0 points by ericepps about 1 month ago
        Oops, sorry, didn't re-read the heading before I posted the reply. Content being removed after expiration isn't the main topic of discussion here, but it seems relevant.
    • 0 points by len.lanphar about 1 month ago
      An excerpt of my discussion with Bradley...

      It sounds like one of the major challenges is in answering the "what pages link to this asset" in a manner that is both accurate and fast. What if, on every page render, you have a cached_references table that you update with a bunch of rows mapping that page to all the known assets it links to as of that rendering. That would give you the ability to do a very fast "what was the last known set of links to this asset" query. It wouldn't be 100% accurate, but it would cover an awful lot of cases.

      There's always going to be that risk of not knowing what exactly's out there unless you crawl the published site and look at everything. For example, what if we have page A that links to page B and is then published. Then page A is edited to no longer link to page B, but is not republished. Now let's say you move page B to a different folder. As far as the CMS is concerned, page B is no longer of interest to page A, but the published site tells a different story. The only way to get around this kind of stuff would be to do something like include versions in the cached_references table and then also maintain another table linking assets to destinations to last version published at that destination, giving you a slightly better picture of what the state of that published asset is at each destination.
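As a sketch only, the `cached_references` idea could look something like this in Python with SQLite. The column names and the two helper functions are assumptions for illustration (nothing here reflects Cascade's actual schema), and the version and destination tracking described above is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE cached_references (
           page_path   TEXT NOT NULL,
           target_path TEXT NOT NULL,
           PRIMARY KEY (page_path, target_path)
       )"""
)


def record_render(conn, page_path, linked_paths):
    """Replace the cached rows for one page with the links seen in this render."""
    with conn:  # commit both statements together
        conn.execute(
            "DELETE FROM cached_references WHERE page_path = ?", (page_path,)
        )
        conn.executemany(
            "INSERT OR IGNORE INTO cached_references VALUES (?, ?)",
            [(page_path, target) for target in linked_paths],
        )


def pages_linking_to(conn, target_path):
    """Answer 'what was the last known set of pages linking to this asset?'"""
    rows = conn.execute(
        "SELECT page_path FROM cached_references"
        " WHERE target_path = ? ORDER BY page_path",
        (target_path,),
    )
    return [row[0] for row in rows]


record_render(conn, "/a.html", ["/b.html", "/c.html"])
record_render(conn, "/d.html", ["/b.html"])
before_edit = pages_linking_to(conn, "/b.html")
print(before_edit)  # ['/a.html', '/d.html']

# Page A is edited to drop its link to B and rendered again (but not republished).
record_render(conn, "/a.html", ["/c.html"])
after_edit = pages_linking_to(conn, "/b.html")
print(after_edit)  # ['/d.html']
```

The second query shows the gap described above: the table now says page A no longer cares about page B, even though the published copy of A may still link to it.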

      I would advise against any sort of automated republishing of assets.  We oftentimes have users who are in the process of working on a page, and if things are automatically republished without their knowledge we may end up with half-finished pages hitting the live site.  I would instead suggest going towards more of a "dirty asset" concept. If an asset's last-modified date is > its last-published date, it's dirty. If an asset has been created but not yet published, it's dirty.  If an asset links to a page that has been moved, it's dirty. And so on. Then one could get some sort of "show me all the dirty assets" view to see a list of things in that state and fire off whichever ones they want to the publisher.
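Roughly, the three "dirty" rules above could be expressed like this in Python. The `Asset` fields are hypothetical names chosen for the example, and `links_to_moved_asset` stands in for whatever lookup would really be needed to know that a linked page has moved.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Asset:
    path: str
    last_modified: datetime
    last_published: Optional[datetime] = None  # None means never published
    links_to_moved_asset: bool = False


def is_dirty(asset):
    """Apply the three rules: never published, modified since publish, stale link."""
    if asset.last_published is None:
        return True
    if asset.last_modified > asset.last_published:
        return True
    return asset.links_to_moved_asset


def dirty_assets(assets):
    """The 'show me all the dirty assets' view: paths the user may want to publish."""
    return [a.path for a in assets if is_dirty(a)]


assets = [
    Asset("/clean.html", datetime(2010, 1, 1), datetime(2010, 2, 1)),
    Asset("/edited.html", datetime(2010, 3, 1), datetime(2010, 2, 1)),
    Asset("/new.html", datetime(2010, 3, 1)),
    Asset("/stale-link.html", datetime(2010, 1, 1), datetime(2010, 2, 1), True),
]
print(dirty_assets(assets))  # ['/edited.html', '/new.html', '/stale-link.html']
```

Nothing is published automatically here; the list is just something a user could review before sending items to the publisher.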

      Oh, tangentially related to this I should clarify what I was referring to by ghost assets. I wasn't thinking of them being redirect pages so much as pseudo-assets that exist only in the CMS as a cue to content providers. Let's say I rename foo to bar, the page would be moved as it is now, with no unpublishing, but there would also be a "marker" asset at foo notifying CMS users, "hey, there's this thing that was moved and you're not done dealing with it yet." The way I was envisioning its use was to be another dirty asset. When something is renamed, the old file has to be unpublished, the new file published, and any pages with references to the old file republished. I was trying to think of some way to better synchronize those 3 operations without them necessarily being triggered automatically at the time you do the rename, which may have unintended consequences from the content owners' perspective.

      • 0 points by lroberson about 1 month ago
        Excellent, Len. It's obvious you've put a lot of thought into this. I figured Cascade already had a xref table like you describe, so it wouldn't be that hard to extend it.

        When speaking of "dirty" assets, I often wish this information were already available to us. I understand that there is an is-published node in the index block result set, but my fellow developers tell me it is unreliable, so I guess if they are right, the mechanism that does something similar to that is broken in the product already.

        We have some very active and -- ahem -- bored site owners who will go about adjusting their site quite a bit from time to time, then leave it for a few weeks, then come back, so I also advocate for the least amount of aggressive publishing.
  • 0 points by wabaus about 1 month ago
    I like option #3 pretty well.  It leaves some debris until the next time we do a clean full publish -- but it's useful debris, not stale content. And it helps search-engines figure out the change.  So we'd have to do the clean full-publish less often.

    Instead of creating several versions of the 301 page (php, asp, etc.), why not give us (Cascade admins) a code block to write it ourselves in our favorite language (with a system variable we'd insert to drop the new URL into our code).
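To illustrate the "code block with a system variable" idea, here is a small Python sketch using `string.Template`. The `${new_url}` placeholder name and the `render_redirect` helper are invented for the example; Cascade does not necessarily offer anything like this. The template body is a PHP 301 redirect, but an admin could supply ASP or anything else. Note that the sketch does no escaping of the URL, and any literal `$` in the admin's own code would have to be written as `$$`.

```python
from string import Template

# Admin-supplied redirect page; ${new_url} is the hypothetical system variable.
admin_template = Template(
    "<?php\n"
    "header('Location: ${new_url}', true, 301);\n"
    "exit;\n"
)


def render_redirect(template, new_url):
    """Drop the asset's new URL into the admin's redirect template."""
    return template.substitute(new_url=new_url)


rendered = render_redirect(admin_template, "/news/archive/story.html")
print(rendered)
```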
  • 0 points by johnsons4.scranton about 1 month ago
    I like #3 as well.  In the scenario I posted above, I wouldn't mind if it went to a 404 not found page (I'm going to be working on a custom 404 page in the near future with useful goodies on it!), though a 301 back to the main press releases page would be awesome.  To my understanding, we're using a .htaccess file for all our 301s, but I think it's in ASP.    If there was a way in the CMS to create a permanent (301) redirect (not javascript) in the place of an un-published page that will not be replaced, that would be a good option!

    We wouldn't be a fan of broken links.  It's better to republish the affected pages or entire folder.  As far as cached pages in search...if we had an easy way to add the permanent redirect, that wouldn't remain an issue for too long.  We also could use Google Webmaster Tools to eliminate it from Google.

    I'm sure this is a very complex problem, requiring an even more complex solution.  Let me know if you want more details on our scenario or other feedback.
  • 0 points by len.lanphar about 1 month ago
    In my mind I've started conceptualizing this as two separate, but related, problems. The first is that when published assets are moved that movement happens independently of the published site and those published assets become orphaned. The second is that when an asset is moved, other published assets out there linking to it still link to the asset's original location.

    For us at least, problem #1 is much more serious than problem #2. While we'd obviously like to see both addressed, if push comes to shove and we're faced with the choice of having a version of the product that addresses #1 within a few months vs. waiting significantly longer than that for a version of the product that addresses #1 and #2, I vehemently vote for the first option.
  • 0 points by TimothyGilman about 1 month ago
    (Note: The comments below address the issue of orphaned assets when an asset is moved in Cascade.  There is also an idea for maintaining links to the asset's original location.  Not addressed are search engine and redirect issues.)

    I agree with len.lanphar that automatically unpublishing and/or republishing assets would be problematic.  Some cases where an (immediate) unpublish is NOT desired are...

    • Making updates to an asset, including changing its location (i.e., parent folder).  We would want to make all the changes before publishing it.  Therefore a publish that is triggered on the path name change alone would not be desired.
    • Temporarily changing the path name of a page or a folder, for the purposes of testing how a page works once it is published, but before "going live" with it.  In this case, we would intentionally opt out of any cleanup to the published assets.  (This is a use case for Admins primarily.  We, generally, know what we are doing and what cleanup needs to happen on the published server.)

      The concept of the "dirty" asset is intriguing.  I understand that this would work such that on a rename of an asset, the original would remain as "dirty" with the only (or primary) action available being an unpublish.  Here are some other ideas:

    • Similar to when an asset is deleted, when an asset is renamed, Cascade could give the user the option of unpublishing the original at the time the asset is moved.
    • A little more complicated, but probably better, is that upon publish of an asset, Cascade would check to see if the name and/or path had changed since the last publish, and if so, give the option to unpublish the original.  Additionally, this strategy could also be used to detect any assets that link to the current asset, and if the current asset has a different name/path since the last time it was published, Cascade could give the option to republish those assets (perhaps providing checkboxes next to the list).
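A small Python sketch of the second idea, assuming the CMS keeps the path each asset had at its last publish and can look up which assets link to it. The function and field names are hypothetical. The result is a set of options to show the user, not actions taken automatically.

```python
def publish_plan(asset_path, last_published_path, inbound_links):
    """Work out what to offer the user when an asset is published.

    asset_path:          the asset's current path in the CMS
    last_published_path: path at the time of the last publish (None if never)
    inbound_links:       paths of assets known to link to this asset
    """
    plan = {"publish": asset_path, "offer_unpublish": None, "offer_republish": []}
    if last_published_path is not None and last_published_path != asset_path:
        # The name or folder changed since the last publish: the old copy is
        # orphaned, and linking pages still point at the old location.
        plan["offer_unpublish"] = last_published_path
        plan["offer_republish"] = sorted(inbound_links)
    return plan


plan = publish_plan("/contact-us.html", "/contact.html", ["/index.html", "/about.html"])
print(plan["offer_unpublish"])  # /contact.html
print(plan["offer_republish"])  # ['/about.html', '/index.html']
```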

    The value of these ideas is twofold.  One, it provides the capability to unpublish the old asset as part of the move process.  Two, it alerts the end user to the fact that a republish and/or unpublish is necessary.

  • 0 points by emumpton about 1 month ago
    TimothyG., this is a good example of a case where the goal should not be automation but a method to communicate ("... alerts the end users...") or report the results of an action for later resolution by the site owner/manager.
